{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ItXfxkxvosLH"
   },
   "source": [
    "# Training a Binary Classifier to Perform Sentiment Analysis\n",
    "\n",
    "## Overview\n",
    "\n",
    "This notebook demonstrates text classification starting from plain text files stored on disk. You'll train a binary classifier to perform sentiment analysis on an IMDB dataset. At the end of the notebook, there is an exercise for you to try, in which you'll train a multi-class classifier to predict the tag for a programming question on Stack Overflow.\n",
    "\n",
    "## Learning Objective\n",
    "\n",
    "In this notebook, you learn how to:\n",
    "\n",
    "1. Prepare the dataset for training\n",
    "2. Use loss function and optimizer\n",
    "3. Train the model\n",
    "4. Evaluate the model\n",
    "5. Export the model\n",
    "\n",
    "## Introduction\n",
    "\n",
    "This notebook shows how to train a sentiment analysis model to classify movie reviews as positive or negative, based on the text of the review.\n",
    "\n",
    "Each learning objective will correspond to a __#TODO__ in this student lab notebook -- try to complete this notebook first and then review the [solution notebook](../solutions/text_classification.ipynb)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "id": "8RZOuS9LWQvv"
   },
   "outputs": [],
   "source": [
    "# Import necessary libraries\n",
    "import matplotlib.pyplot as plt\n",
    "import os\n",
    "import re\n",
    "import shutil\n",
    "import string\n",
    "import tensorflow as tf\n",
    "\n",
    "from tensorflow.keras import layers\n",
    "from tensorflow.keras import losses\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "id": "6-tTFS04dChr"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2.6.3\n"
     ]
    }
   ],
   "source": [
    "# Print the TensorFlow version\n",
    "print(tf.__version__)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "NBTI1bi8qdFV"
   },
   "source": [
    "## Sentiment analysis\n",
    "\n",
    "This notebook trains a sentiment analysis model to classify movie reviews as *positive* or *negative*, based on the text of the review. This is an example of *binary*—or two-class—classification, an important and widely applicable kind of machine learning problem.\n",
    "\n",
    "You'll use the [Large Movie Review Dataset](https://ai.stanford.edu/~amaas/data/sentiment/) that contains the text of 50,000 movie reviews from the [Internet Movie Database](https://www.imdb.com/). These are split into 25,000 reviews for training and 25,000 reviews for testing. The training and testing sets are *balanced*, meaning they contain an equal number of positive and negative reviews.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "iAsKG535pHep"
   },
   "source": [
    "### Download and explore the IMDB dataset\n",
    "\n",
    "Let's download and extract the dataset, then explore the directory structure."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "id": "k7ZYnuajVlFN"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Downloading data from https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz\n",
      "84131840/84125825 [==============================] - 2s 0us/step\n",
      "84140032/84125825 [==============================] - 2s 0us/step\n"
     ]
    }
   ],
   "source": [
    "# Download the IMDB dataset\n",
    "url = \"https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz\"\n",
    "\n",
    "dataset = tf.keras.utils.get_file(\"aclImdb_v1\", url,\n",
    "                                    untar=True, cache_dir='.',\n",
    "                                    cache_subdir='')\n",
    "\n",
    "dataset_dir = os.path.join(os.path.dirname(dataset), 'aclImdb')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "id": "355CfOvsV1pl"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['test', 'imdb.vocab', 'imdbEr.txt', 'train', 'README']"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Explore the dataset\n",
    "os.listdir(dataset_dir)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "id": "7ASND15oXpF1"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['unsupBow.feat',\n",
       " 'unsup',\n",
       " 'urls_unsup.txt',\n",
       " 'labeledBow.feat',\n",
       " 'pos',\n",
       " 'neg',\n",
       " 'urls_neg.txt',\n",
       " 'urls_pos.txt']"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train_dir = os.path.join(dataset_dir, 'train')\n",
    "os.listdir(train_dir)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ysMNMI1CWDFD"
   },
   "source": [
    "The `aclImdb/train/pos` and `aclImdb/train/neg` directories contain many text files, each of which is a single movie review. Let's take a look at one of them."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "id": "R7g8hFvzWLIZ"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Rachel Griffiths writes and directs this award winning short film. A heartwarming story about coping with grief and cherishing the memory of those we've loved and lost. Although, only 15 minutes long, Griffiths manages to capture so much emotion and truth onto film in the short space of time. Bud Tingwell gives a touching performance as Will, a widower struggling to cope with his wife's death. Will is confronted by the harsh reality of loneliness and helplessness as he proceeds to take care of Ruth's pet cow, Tulip. The film displays the grief and responsibility one feels for those they have loved and lost. Good cinematography, great direction, and superbly acted. It will bring tears to all those who have lost a loved one, and survived.\n"
     ]
    }
   ],
   "source": [
    "# Print the file content\n",
    "sample_file = os.path.join(train_dir, 'pos/1181_9.txt')\n",
    "with open(sample_file) as f:\n",
    "  print(f.read())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Mk20TEm6ZRFP"
   },
   "source": [
    "### Load the dataset\n",
    "\n",
    "Next, you will load the data off disk and prepare it into a format suitable for training. To do so, you will use the helpful [text_dataset_from_directory](https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/text_dataset_from_directory) utility, which expects a directory structure as follows.\n",
    "\n",
    "```\n",
    "main_directory/\n",
    "...class_a/\n",
    "......a_text_1.txt\n",
    "......a_text_2.txt\n",
    "...class_b/\n",
    "......b_text_1.txt\n",
    "......b_text_2.txt\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "nQauv38Lnok3"
   },
   "source": [
    "To prepare a dataset for binary classification, you will need two folders on disk, corresponding to `class_a` and `class_b`. These will be the positive and negative movie reviews, which can be found in  `aclImdb/train/pos` and `aclImdb/train/neg`. As the IMDB dataset contains additional folders, you will remove them before using this utility."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "id": "VhejsClzaWfl"
   },
   "outputs": [],
   "source": [
    "remove_dir = os.path.join(train_dir, 'unsup')\n",
    "shutil.rmtree(remove_dir)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "95kkUdRoaeMw"
   },
   "source": [
    "Next, you will use the `text_dataset_from_directory` utility to create a labeled `tf.data.Dataset`. [tf.data](https://www.tensorflow.org/guide/data) is a powerful collection of tools for working with data. \n",
    "\n",
    "When running a machine learning experiment, it is a best practice to divide your dataset into three splits: [train](https://developers.google.com/machine-learning/glossary#training_set), [validation](https://developers.google.com/machine-learning/glossary#validation_set), and [test](https://developers.google.com/machine-learning/glossary#test-set). \n",
    "\n",
    "The IMDB dataset has already been divided into train and test, but it lacks a validation set. Let's create a validation set using an 80:20 split of the training data by using the `validation_split` argument below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "id": "nOrK-MTYaw3C"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Found 25000 files belonging to 2 classes.\n",
      "Using 20000 files for training.\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2022-03-30 12:30:34.556777: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.\n"
     ]
    }
   ],
   "source": [
    "# Create the validation set\n",
    "batch_size = 32\n",
    "seed = 42\n",
    "\n",
    "raw_train_ds = tf.keras.utils.text_dataset_from_directory(\n",
    "    'aclImdb/train', \n",
    "    batch_size=batch_size, \n",
    "    validation_split=0.2, \n",
    "    subset='training', \n",
    "    seed=seed)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "5Y33oxOUpYkh"
   },
   "source": [
    "As you can see above, there are 25,000 examples in the training folder, of which you will use 80% (or 20,000) for training. As you will see in a moment, you can train a model by passing a dataset directly to `model.fit`. If you're new to `tf.data`, you can also iterate over the dataset and print out a few examples as follows."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "id": "51wNaPPApk1K"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Review b'\"Pandemonium\" is a horror movie spoof that comes off more stupid than funny. Believe me when I tell you, I love comedies. Especially comedy spoofs. \"Airplane\", \"The Naked Gun\" trilogy, \"Blazing Saddles\", \"High Anxiety\", and \"Spaceballs\" are some of my favorite comedies that spoof a particular genre. \"Pandemonium\" is not up there with those films. Most of the scenes in this movie had me sitting there in stunned silence because the movie wasn\\'t all that funny. There are a few laughs in the film, but when you watch a comedy, you expect to laugh a lot more than a few times and that\\'s all this film has going for it. Geez, \"Scream\" had more laughs than this film and that was more of a horror film. How bizarre is that?<br /><br />*1/2 (out of four)'\n",
      "Label 0\n",
      "Review b\"David Mamet is a very interesting and a very un-equal director. His first movie 'House of Games' was the one I liked best, and it set a series of films with characters whose perspective of life changes as they get into complicated situations, and so does the perspective of the viewer.<br /><br />So is 'Homicide' which from the title tries to set the mind of the viewer to the usual crime drama. The principal characters are two cops, one Jewish and one Irish who deal with a racially charged area. The murder of an old Jewish shop owner who proves to be an ancient veteran of the Israeli Independence war triggers the Jewish identity in the mind and heart of the Jewish detective.<br /><br />This is were the flaws of the film are the more obvious. The process of awakening is theatrical and hard to believe, the group of Jewish militants is operatic, and the way the detective eventually walks to the final violent confrontation is pathetic. The end of the film itself is Mamet-like smart, but disappoints from a human emotional perspective.<br /><br />Joe Mantegna and William Macy give strong performances, but the flaws of the story are too evident to be easily compensated.\"\n",
      "Label 0\n",
      "Review b'Great documentary about the lives of NY firefighters during the worst terrorist attack of all time.. That reason alone is why this should be a must see collectors item.. What shocked me was not only the attacks, but the\"High Fat Diet\" and physical appearance of some of these firefighters. I think a lot of Doctors would agree with me that,in the physical shape they were in, some of these firefighters would NOT of made it to the 79th floor carrying over 60 lbs of gear. Having said that i now have a greater respect for firefighters and i realize becoming a firefighter is a life altering job. The French have a history of making great documentary\\'s and that is what this is, a Great Documentary.....'\n",
      "Label 1\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2022-03-30 12:30:40.050775: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)\n"
     ]
    }
   ],
   "source": [
    "# Print few examples\n",
    "for text_batch, label_batch in raw_train_ds.take(1):\n",
    "  for i in range(3):\n",
    "    print(\"Review\", text_batch.numpy()[i])\n",
    "    print(\"Label\", label_batch.numpy()[i])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "JWq1SUIrp1a-"
   },
   "source": [
    "Notice the reviews contain raw text (with punctuation and occasional HTML tags like `<br/>`). You will show how to handle these in the following section. \n",
    "\n",
    "The labels are 0 or 1. To see which of these correspond to positive and negative movie reviews, you can check the `class_names` property on the dataset.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "id": "MlICTG8spyO2"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Label 0 corresponds to neg\n",
      "Label 1 corresponds to pos\n"
     ]
    }
   ],
   "source": [
    "print(\"Label 0 corresponds to\", raw_train_ds.class_names[0])\n",
    "print(\"Label 1 corresponds to\", raw_train_ds.class_names[1])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "pbdO39vYqdJr"
   },
   "source": [
    "Next, you will create a validation and test dataset. You will use the remaining 5,000 reviews from the training set for validation."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "SzxazN8Hq1pF"
   },
   "source": [
    "Note:  When using the `validation_split` and `subset` arguments, make sure to either specify a random seed, or to pass `shuffle=False`, so that the validation and training splits have no overlap."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "id": "JsMwwhOoqjKF"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Found 25000 files belonging to 2 classes.\n",
      "Using 5000 files for validation.\n"
     ]
    }
   ],
   "source": [
    "raw_val_ds = tf.keras.utils.text_dataset_from_directory(\n",
    "    'aclImdb/train', \n",
    "    batch_size=batch_size, \n",
    "    validation_split=0.2, \n",
    "    subset='validation', \n",
    "    seed=seed)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "id": "rdSr0Nt3q_ns"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Found 25000 files belonging to 2 classes.\n"
     ]
    }
   ],
   "source": [
    "raw_test_ds = tf.keras.utils.text_dataset_from_directory(\n",
    "    'aclImdb/test', \n",
    "    batch_size=batch_size)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "qJmTiO0IYAjm"
   },
   "source": [
    "### Prepare the dataset for training\n",
    "\n",
    "Next, you will standardize, tokenize, and vectorize the data using the helpful `tf.keras.layers.TextVectorization` layer. \n",
    "\n",
    "Standardization refers to preprocessing the text, typically to remove punctuation or HTML elements to simplify the dataset. Tokenization refers to splitting strings into tokens (for example, splitting a sentence into individual words, by splitting on whitespace). Vectorization refers to converting tokens into numbers so they can be fed into a neural network. All of these tasks can be accomplished with this layer.\n",
    "\n",
    "As you saw above, the reviews contain various HTML tags like `<br />`. These tags will not be removed by the default standardizer in the `TextVectorization` layer (which converts text to lowercase and strips punctuation by default, but doesn't strip HTML). You will write a custom standardization function to remove the HTML."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ZVcHl-SLrH-u"
   },
   "source": [
    "Note: To prevent [training-testing skew](https://developers.google.com/machine-learning/guides/rules-of-ml#training-serving_skew) (also known as training-serving skew), it is important to preprocess the data identically at train and test time. To facilitate this, the `TextVectorization` layer can be included directly inside your model, as shown later in this tutorial."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "id": "SDRI_s_tX1Hk"
   },
   "outputs": [],
   "source": [
    "def custom_standardization(input_data):\n",
    "  lowercase = tf.strings.lower(input_data)\n",
    "  stripped_html = tf.strings.regex_replace(lowercase, '<br />', ' ')\n",
    "  return tf.strings.regex_replace(stripped_html,\n",
    "                                  '[%s]' % re.escape(string.punctuation),\n",
    "                                  '')"
   ]
  },
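  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For intuition, the same three steps (lowercase, replace `<br />` with a space, strip punctuation) can be sketched in plain Python. This is only an illustration under the assumption that a single Python string is being cleaned: `standardize_py` is a hypothetical helper that is not used elsewhere in this notebook, and the `TextVectorization` layer still needs the TensorFlow version above because it operates on string tensors.\n",
    "\n",
    "```python\n",
    "import re\n",
    "import string\n",
    "\n",
    "def standardize_py(text):\n",
    "    # Hypothetical pure-Python mirror of custom_standardization, for intuition only.\n",
    "    lowered = text.lower()\n",
    "    # Replace the literal HTML line break with a space so adjacent words stay separated.\n",
    "    no_html = lowered.replace('<br />', ' ')\n",
    "    # Remove every ASCII punctuation character.\n",
    "    return re.sub('[%s]' % re.escape(string.punctuation), '', no_html)\n",
    "\n",
    "print(standardize_py('Great film!<br /><br />Loved it.'))\n",
    "```\n",
    "\n",
    "The order matters: the tag is replaced before punctuation is stripped, because `<`, `/`, and `>` are themselves punctuation characters and the tag would otherwise leave a stray `br` token behind."
   ]
  },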
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "d2d3Aw8dsUux"
   },
   "source": [
    "Next, you will create a `TextVectorization` layer. You will use this layer to standardize, tokenize, and vectorize our data. You set the `output_mode` to `int` to create unique integer indices for each token.\n",
    "\n",
    "Note that you're using the default split function, and the custom standardization function you defined above. You'll also define some constants for the model, like an explicit maximum `sequence_length`, which will cause the layer to pad or truncate sequences to exactly `sequence_length` values."
   ]
  },
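  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The effect of `output_mode='int'` together with a fixed `output_sequence_length` can be sketched in plain Python. This is a simplified illustration, not the layer's implementation: `build_vocab` and `vectorize_py` are hypothetical helpers, tokens are split on whitespace only, and index 0 (padding) and index 1 (out-of-vocabulary) are reserved to follow the convention the layer uses in `int` mode.\n",
    "\n",
    "```python\n",
    "from collections import Counter\n",
    "\n",
    "def build_vocab(texts, max_tokens):\n",
    "    # Index 0 is reserved for padding and index 1 for out-of-vocabulary tokens,\n",
    "    # so only max_tokens - 2 real words receive their own index.\n",
    "    counts = Counter(word for text in texts for word in text.split())\n",
    "    most_common = [word for word, _ in counts.most_common(max_tokens - 2)]\n",
    "    return {word: index + 2 for index, word in enumerate(most_common)}\n",
    "\n",
    "def vectorize_py(text, vocab, sequence_length):\n",
    "    # Map each token to its index (1 if unknown), then truncate or pad with zeros.\n",
    "    ids = [vocab.get(word, 1) for word in text.split()][:sequence_length]\n",
    "    return ids + [0] * (sequence_length - len(ids))\n",
    "\n",
    "vocab = build_vocab(['a great movie', 'a dull movie'], max_tokens=4)\n",
    "print(vectorize_py('a great great movie', vocab, sequence_length=6))\n",
    "```\n",
    "\n",
    "With `max_tokens=4`, only the two most frequent words get their own index, so the rarer word `great` maps to 1 and the short input is padded with zeros up to the requested length."
   ]
  },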
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "id": "-c76RvSzsMnX"
   },
   "outputs": [],
   "source": [
    "max_features = 10000\n",
    "sequence_length = 250\n",
    "\n",
    "# Created the TextVectorization layer\n",
    "vectorize_layer = # TODO 1 -- Your code goes here(\n",
    "    standardize=custom_standardization,\n",
    "    max_tokens=max_features,\n",
    "    output_mode='int',\n",
    "    output_sequence_length=sequence_length)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "vlFOpfF6scT6"
   },
   "source": [
    "Next, you will call `adapt` to fit the state of the preprocessing layer to the dataset. This will cause the model to build an index of strings to integers."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "lAhdjK7AtroA"
   },
   "source": [
    "Note: It's important to only use your training data when calling adapt (using the test set would leak information)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "id": "GH4_2ZGJsa_X"
   },
   "outputs": [],
   "source": [
    "# Make a text-only dataset (without labels), then call adapt\n",
    "train_text = raw_train_ds.map(lambda x, y: x)\n",
    "vectorize_layer.adapt(train_text)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "SHQVEFzNt-K_"
   },
   "source": [
    "Let's create a function to see the result of using this layer to preprocess some data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "id": "SCIg_T50wOCU"
   },
   "outputs": [],
   "source": [
    "def vectorize_text(text, label):\n",
    "  text = tf.expand_dims(text, -1)\n",
    "  return vectorize_layer(text), label"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "id": "XULcm6B3xQIO"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Review tf.Tensor(b'Great movie - especially the music - Etta James - \"At Last\". This speaks volumes when you have finally found that special someone.', shape=(), dtype=string)\n",
      "Label neg\n",
      "Vectorized review (<tf.Tensor: shape=(1, 250), dtype=int64, numpy=\n",
      "array([[  86,   17,  260,    2,  222,    1,  571,   31,  229,   11, 2418,\n",
      "           1,   51,   22,   25,  404,  251,   12,  306,  282,    0,    0,\n",
      "           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,\n",
      "           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,\n",
      "           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,\n",
      "           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,\n",
      "           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,\n",
      "           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,\n",
      "           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,\n",
      "           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,\n",
      "           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,\n",
      "           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,\n",
      "           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,\n",
      "           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,\n",
      "           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,\n",
      "           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,\n",
      "           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,\n",
      "           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,\n",
      "           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,\n",
      "           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,\n",
      "           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,\n",
      "           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,\n",
      "           0,    0,    0,    0,    0,    0,    0,    0]])>, <tf.Tensor: shape=(), dtype=int32, numpy=0>)\n"
     ]
    }
   ],
   "source": [
    "# retrieve a batch (of 32 reviews and labels) from the dataset\n",
    "text_batch, label_batch = next(iter(raw_train_ds))\n",
    "first_review, first_label = text_batch[0], label_batch[0]\n",
    "print(\"Review\", first_review)\n",
    "print(\"Label\", raw_train_ds.class_names[first_label])\n",
    "print(\"Vectorized review\", vectorize_text(first_review, first_label))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "6u5EX0hxyNZT"
   },
   "source": [
    "As you can see above, each token has been replaced by an integer. You can lookup the token (string) that each integer corresponds to by calling `.get_vocabulary()` on the layer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "id": "kRq9hTQzhVhW"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1287 --->  silent\n",
      " 313 --->  night\n",
      "Vocabulary size: 10000\n"
     ]
    }
   ],
   "source": [
    "# Print the token (string) that each integer corresponds\n",
    "print(\"1287 ---> \",vectorize_layer.get_vocabulary()[1287])\n",
    "print(\" 313 ---> \",vectorize_layer.get_vocabulary()[313])\n",
    "print('Vocabulary size: {}'.format(len(vectorize_layer.get_vocabulary())))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "XD2H6utRydGv"
   },
   "source": [
    "You are nearly ready to train your model. As a final preprocessing step, you will apply the TextVectorization layer you created earlier to the train, validation, and test dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "id": "2zhmpeViI1iG"
   },
   "outputs": [],
   "source": [
    "# Apply the TextVectorization layer you created earlier to the train, validation, and test dataset\n",
    "train_ds = raw_train_ds.map(vectorize_text)\n",
    "val_ds = raw_val_ds.map(vectorize_text)\n",
    "test_ds = raw_test_ds.map(vectorize_text)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "YsVQyPMizjuO"
   },
   "source": [
    "### Configure the dataset for performance\n",
    "\n",
    "These are two important methods you should use when loading data to make sure that I/O does not become blocking.\n",
    "\n",
    "`.cache()` keeps data in memory after it's loaded off disk. This will ensure the dataset does not become a bottleneck while training your model. If your dataset is too large to fit into memory, you can also use this method to create a performant on-disk cache, which is more efficient to read than many small files.\n",
    "\n",
    "`.prefetch()` overlaps data preprocessing and model execution while training. \n",
    "\n",
    "You can learn more about both methods, as well as how to cache data to disk in the [data performance guide](https://www.tensorflow.org/guide/data_performance)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "id": "wMcs_H7izm5m"
   },
   "outputs": [],
   "source": [
    "AUTOTUNE = tf.data.AUTOTUNE\n",
    "\n",
    "train_ds = train_ds.cache().prefetch(buffer_size=AUTOTUNE)\n",
    "val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)\n",
    "test_ds = test_ds.cache().prefetch(buffer_size=AUTOTUNE)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "LLC02j2g-llC"
   },
   "source": [
    "### Create the model\n",
    "\n",
    "It's time to create your neural network:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "id": "dkQP6in8yUBR"
   },
   "outputs": [],
   "source": [
    "embedding_dim = 16"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "id": "xpKOoWgu-llD"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Model: \"sequential\"\n",
      "_________________________________________________________________\n",
      "Layer (type)                 Output Shape              Param #   \n",
      "=================================================================\n",
      "embedding (Embedding)        (None, None, 16)          160016    \n",
      "_________________________________________________________________\n",
      "dropout (Dropout)            (None, None, 16)          0         \n",
      "_________________________________________________________________\n",
      "global_average_pooling1d (Gl (None, 16)                0         \n",
      "_________________________________________________________________\n",
      "dropout_1 (Dropout)          (None, 16)                0         \n",
      "_________________________________________________________________\n",
      "dense (Dense)                (None, 1)                 17        \n",
      "=================================================================\n",
      "Total params: 160,033\n",
      "Trainable params: 160,033\n",
      "Non-trainable params: 0\n",
      "_________________________________________________________________\n"
     ]
    }
   ],
   "source": [
    "# Create your neural network\n",
    "model = tf.keras.Sequential([\n",
    "  layers.Embedding(max_features + 1, embedding_dim),\n",
    "  layers.Dropout(0.2),\n",
    "  layers.GlobalAveragePooling1D(),\n",
    "  layers.Dropout(0.2),\n",
    "  layers.Dense(1)])\n",
    "\n",
    "model.summary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "6PbKQ6mucuKL"
   },
   "source": [
    "The layers are stacked sequentially to build the classifier:\n",
    "\n",
    "1. The first layer is an `Embedding` layer. This layer takes the integer-encoded reviews and looks up an embedding vector for each word-index. These vectors are learned as the model trains. The vectors add a dimension to the output array. The resulting dimensions are: `(batch, sequence, embedding)`.  To learn more about embeddings, check out the [Word embeddings](https://www.tensorflow.org/text/guide/word_embeddings) tutorial.\n",
    "2. Next, a `GlobalAveragePooling1D` layer returns a fixed-length output vector for each example by averaging over the sequence dimension. This allows the model to handle input of variable length, in the simplest way possible.\n",
    "3. This fixed-length output vector is piped through a fully-connected (`Dense`) layer with 16 hidden units. \n",
    "4. The last layer is densely connected with a single output node."
   ]
  },
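  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the pooling step concrete, here is a minimal plain-Python sketch of averaging over the sequence dimension for a single example. It assumes the embedded review is a list of equal-length vectors and ignores batching and masking; `global_average_pool` is a hypothetical helper, not the Keras layer.\n",
    "\n",
    "```python\n",
    "def global_average_pool(sequence):\n",
    "    # sequence has shape (sequence, embedding): one vector per token.\n",
    "    length = len(sequence)\n",
    "    # zip(*sequence) groups values by embedding dimension, so each output\n",
    "    # entry is the mean of one dimension across all tokens.\n",
    "    return [sum(column) / length for column in zip(*sequence)]\n",
    "\n",
    "embedded = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 tokens, 2 dimensions\n",
    "print(global_average_pool(embedded))\n",
    "```\n",
    "\n",
    "The output has one value per embedding dimension regardless of how many tokens went in, which is why the `Dense` layer after it can have a fixed number of weights. The parameter counts in the model summary follow the same shapes: the embedding table holds (10,000 + 1) x 16 = 160,016 values, and the final layer holds 16 weights plus 1 bias = 17."
   ]
  },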
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "L4EqVWg4-llM"
   },
   "source": [
    "### Loss function and optimizer\n",
    "\n",
    "A model needs a loss function and an optimizer for training. Since this is a binary classification problem and the model outputs a probability (a single-unit layer with a sigmoid activation), you'll use `losses.BinaryCrossentropy` loss function.\n",
    "\n",
    "Now, configure the model to use an optimizer and a loss function:"
   ]
  },
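  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The compile call in the next cell passes `from_logits=True` and a `BinaryAccuracy` threshold of 0.0, which is consistent with the final `Dense(1)` layer emitting a raw logit. The sketch below, written in plain Python for a single example, shows binary cross-entropy computed directly from a logit and why thresholding the logit at 0.0 is equivalent to thresholding the probability at 0.5. The helper names are hypothetical, and this is the standard textbook formula rather than the Keras source code.\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "def sigmoid(logit):\n",
    "    return 1.0 / (1.0 + math.exp(-logit))\n",
    "\n",
    "def binary_crossentropy_from_logit(label, logit):\n",
    "    # Numerically stable form: max(z, 0) - z * y + log(1 + exp(-|z|)).\n",
    "    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))\n",
    "\n",
    "for logit in (-2.0, 0.5, 3.0):\n",
    "    probability = sigmoid(logit)\n",
    "    # The two comparisons always agree because sigmoid(0) is exactly 0.5.\n",
    "    print(logit, round(probability, 3), logit > 0.0, probability > 0.5)\n",
    "```\n",
    "\n",
    "For a positive label, the value returned by `binary_crossentropy_from_logit` equals `-log(sigmoid(logit))`, so a confident and correct prediction (a large positive logit) gives a loss close to zero."
   ]
  },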
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "id": "Mr0GP-cQ-llN"
   },
   "outputs": [],
   "source": [
    "# Configure the model to use an optimizer and a loss function\n",
    "model.compile(loss=# TODO 2 -- Your code goes here(from_logits=True),\n",
    "              optimizer='adam',\n",
    "              metrics=tf.metrics.BinaryAccuracy(threshold=0.0))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "35jv_fzP-llU"
   },
   "source": [
    "### Train the model\n",
    "\n",
    "You will train the model by passing the `dataset` object to the fit method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "id": "tXSGrjWZ-llW"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 1/10\n",
      "625/625 [==============================] - 4s 6ms/step - loss: 0.6641 - binary_accuracy: 0.6956 - val_loss: 0.6142 - val_binary_accuracy: 0.7722\n",
      "Epoch 2/10\n",
      "625/625 [==============================] - 2s 4ms/step - loss: 0.5473 - binary_accuracy: 0.8006 - val_loss: 0.4975 - val_binary_accuracy: 0.8220\n",
      "Epoch 3/10\n",
      "625/625 [==============================] - 2s 4ms/step - loss: 0.4438 - binary_accuracy: 0.8454 - val_loss: 0.4198 - val_binary_accuracy: 0.8474\n",
      "Epoch 4/10\n",
      "625/625 [==============================] - 2s 4ms/step - loss: 0.3784 - binary_accuracy: 0.8658 - val_loss: 0.3736 - val_binary_accuracy: 0.8604\n",
      "Epoch 5/10\n",
      "625/625 [==============================] - 2s 4ms/step - loss: 0.3352 - binary_accuracy: 0.8792 - val_loss: 0.3450 - val_binary_accuracy: 0.8668\n",
      "Epoch 6/10\n",
      "625/625 [==============================] - 2s 4ms/step - loss: 0.3050 - binary_accuracy: 0.8889 - val_loss: 0.3260 - val_binary_accuracy: 0.8708\n",
      "Epoch 7/10\n",
      "625/625 [==============================] - 2s 4ms/step - loss: 0.2817 - binary_accuracy: 0.8964 - val_loss: 0.3127 - val_binary_accuracy: 0.8730\n",
      "Epoch 8/10\n",
      "625/625 [==============================] - 2s 4ms/step - loss: 0.2622 - binary_accuracy: 0.9033 - val_loss: 0.3035 - val_binary_accuracy: 0.8756\n",
      "Epoch 9/10\n",
      "625/625 [==============================] - 2s 4ms/step - loss: 0.2462 - binary_accuracy: 0.9104 - val_loss: 0.2970 - val_binary_accuracy: 0.8772\n",
      "Epoch 10/10\n",
      "625/625 [==============================] - 2s 4ms/step - loss: 0.2319 - binary_accuracy: 0.9158 - val_loss: 0.2922 - val_binary_accuracy: 0.8776\n"
     ]
    }
   ],
   "source": [
    "# Train the model\n",
    "epochs = 10\n",
    "history = # TODO 3 -- Your code goes here(\n",
    "    train_ds,\n",
    "    validation_data=val_ds,\n",
    "    epochs=epochs)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "9EEGuDVuzb5r"
   },
   "source": [
    "### Evaluate the model\n",
    "\n",
    "Let's see how the model performs on the test dataset. Two values are returned: the loss (a number that represents the error; lower values are better) and the accuracy."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "id": "zOMKywn4zReN"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "782/782 [==============================] - 2s 2ms/step - loss: 0.3105 - binary_accuracy: 0.8735\n",
      "Loss:  0.31049710512161255\n",
      "Accuracy:  0.873520016670227\n"
     ]
    }
   ],
   "source": [
    "# Evaluate the model\n",
    "loss, accuracy = # TODO 4 -- Your code goes here(test_ds)\n",
    "\n",
    "print(\"Loss: \", loss)\n",
    "print(\"Accuracy: \", accuracy)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "z1iEXVTR0Z2t"
   },
   "source": [
    "This fairly naive approach achieves an accuracy of about 87%."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ldbQqCw2Xc1W"
   },
   "source": [
    "### Create a plot of accuracy and loss over time\n",
    "\n",
    "`model.fit()` returns a `History` object that contains a dictionary with everything that happened during training:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "id": "-YcvZsdvWfDf"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "dict_keys(['loss', 'binary_accuracy', 'val_loss', 'val_binary_accuracy'])"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "history_dict = history.history\n",
    "history_dict.keys()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "1_CH32qJXruI"
   },
   "source": [
    "There are four entries: one for each monitored metric during training and validation. You can use these to plot the training and validation loss for comparison, as well as the training and validation accuracy:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "id": "2SEMeQ5YXs8z"
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEWCAYAAABrDZDcAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAAArjklEQVR4nO3deXhU5dnH8e/NvoOyKIIkoCKCQMCALIK4taKouFUxohQV0VZFfBWVKtRKa5W2vFhccMGlWLRqqVtdEBG3t4qICwgVNWAUFaJsgrLd7x/PhExCVsjkTDK/z3XNNWfOnHPmngnMPc9u7o6IiKSuGlEHICIi0VIiEBFJcUoEIiIpTolARCTFKRGIiKQ4JQIRkRSnRCAVysz+bWbnV/SxUTKzbDM7NgHXdTM7MLZ9l5ndUJZjd+N1sszsxd2Ns4TrDjKznIq+rlS+WlEHINEzs41xDxsAPwHbY48vdveZZb2Wuw9OxLHVnbuProjrmFk68DlQ2923xa49Eyjz31BSjxKB4O6N8rbNLBu40N3nFD7OzGrlfbmISPWhqiEpVl7R38zGmdnXwAwz28vMnjGz1Wb2fWy7bdw588zswtj2CDN73cwmx4793MwG7+ax7c1svpltMLM5ZjbNzP5WTNxlifF3ZvZG7HovmlmLuOeHm9kKM8s1s/ElfD59zOxrM6sZt+9UM/sgtt3bzN4ys7VmtsrM/mpmdYq51gNmdnPc46tj53xlZiMLHXuimb1nZuvN7Aszmxj39PzY/Voz22hmffM+27jz+5nZO2a2Lnbfr6yfTUnM7JDY+WvNbLGZnRz33AlmtiR2zS/N7H9i+1vE/j5rzew7M3vNzPS9VMn0gUtp9gX2BtKAUYR/MzNij9sBm4G/lnD+4cAyoAVwK3CfmdluHPsI8DbQHJgIDC/hNcsS4znAL4FWQB0g74upM3Bn7Pr7xV6vLUVw9/8DfgCOLnTdR2Lb24ErY++nL3AMcGkJcROL4fhYPMcBBwGF2yd+AM4DmgEnApeY2dDYcwNj983cvZG7v1Xo2nsDzwJTY+/tz8CzZta80HvY5bMpJebawNPAi7HzLgNmmtnBsUPuI1QzNgYOBebG9l8F5AAtgX2A6wHNe1PJlAikNDuACe7+k7tvdvdcd3/C3Te5+wZgEnBkCeevcPd73H078CDQmvAfvszHmlk7oBdwo7tvcffXgaeKe8EyxjjD3f/r7puBx4CM2P4zgGfcfb67/wTcEPsMivN3YBiAmTUGTojtw93fdff/c/dt7p4N3F1EHEX5RSy+j9z9B0Lii39/89z9Q3ff4e4fxF6vLNeFkDg+cfeHY3H9HVgKnBR3THGfTUn6AI2AW2J/o7nAM8Q+G2Ar0NnMmrj79+6+MG5/ayDN3be6+2uuCdAqnRKBlGa1u/+Y98DMGpjZ3bGqk/WEqohm8dUjhXydt+Hum2Kbjcp57H7Ad3H7AL4oLuAyxvh13PamuJj2i7927Is4t7jXIvz6P83M6gKnAQvdfUUsjo6xao+vY3H8nlA6KE2BGIAVhd7f4Wb2Sqzqax0wuozXzbv2ikL7VgBt4h4X99mUGrO7xyfN+OueTkiSK8zsVTPrG9t/G7AceNHMPjOza8v2NqQiKRFIaQr/OrsKOBg43N2bkF8VUVx1T0VYBextZg3i9u1fwvF7EuOq+GvHXrN5cQe7+xLCF95gClYLQahiWgocFIvj+t2JgVC9Fe8RQolof3dvCtwVd93Sfk1/Ragyi9cO+LIMcZV23f0L1e/vvK67v+PupxCqjWYTShq4+wZ3v8rdOxBKJWPN7Jg9jEXKSYlAyqsxoc59bay+eUKiXzD2C3sBMNHM6sR+TZ5Uwil7EuPjwBAzOyLWsHsTpf8/eQS4nJBw/lEojvXARjPrBFxSxhgeA0aYWedYIiocf2NCCelHM+tNSEB5VhOqsjoUc+3ngI5mdo6Z1TKzs4DOhGqcPfEfQtvFNWZW28wG
Ef5Gs2J/sywza+ruWwmfyXYAMxtiZgfG2oLy9m8v8hUkYZQIpLymAPWBNcD/Ac9X0utmERpcc4GbgUcJ4x2KMoXdjNHdFwO/Iny5rwK+JzRmluTvwCBgrruvidv/P4Qv6Q3APbGYyxLDv2PvYS6h2mRuoUMuBW4ysw3AjcR+XcfO3URoE3kj1hOnT6Fr5wJDCKWmXOAaYEihuMvN3bcAJxNKRmuAO4Dz3H1p7JDhQHasimw0cG5s/0HAHGAj8BZwh7vP25NYpPxM7TJSFZnZo8BSd094iUSkulOJQKoEM+tlZgeYWY1Y98pTCHXNIrKHNLJYqop9gScJDbc5wCXu/l60IYlUD6oaEhFJcaoaEhFJcVWuaqhFixaenp4edRgiIlXKu+++u8bdWxb1XJVLBOnp6SxYsCDqMEREqhQzKzyifCdVDYmIpDglAhGRFKdEICKS4qpcG4GIVL6tW7eSk5PDjz/+WPrBEql69erRtm1bateuXeZzlAhEpFQ5OTk0btyY9PR0il9XSKLm7uTm5pKTk0P79u3LfF5KVA3NnAnp6VCjRrifqWW8Rcrlxx9/pHnz5koCSc7MaN68eblLbtW+RDBzJowaBZtiS5qsWBEeA2RlRReXSFWjJFA17M7fqdqXCMaPz08CeTZtCvtFRCQFEsHKleXbLyLJJzc3l4yMDDIyMth3331p06bNzsdbtmwp8dwFCxZw+eWXl/oa/fr1q5BY582bx5AhQyrkWpWl2ieCdoUX+Stlv4jsuYpul2vevDmLFi1i0aJFjB49miuvvHLn4zp16rBt27Ziz83MzGTq1Kmlvsabb765Z0FWYdU+EUyaBA0aFNzXoEHYLyIVL69dbsUKcM9vl6voThojRoxg7NixHHXUUYwbN463336bfv360aNHD/r168eyZcuAgr/QJ06cyMiRIxk0aBAdOnQokCAaNWq08/hBgwZxxhln0KlTJ7Kyssibpfm5556jU6dOHHHEEVx++eWl/vL/7rvvGDp0KN26daNPnz588MEHALz66qs7SzQ9evRgw4YNrFq1ioEDB5KRkcGhhx7Ka6+9VrEfWAmqfWNxXoPw+PGhOqhdu5AE1FAskhgltctV9P+7//73v8yZM4eaNWuyfv165s+fT61atZgzZw7XX389TzzxxC7nLF26lFdeeYUNGzZw8MEHc8kll+zS5/69995j8eLF7LfffvTv35833niDzMxMLr74YubPn0/79u0ZNmxYqfFNmDCBHj16MHv2bObOnct5553HokWLmDx5MtOmTaN///5s3LiRevXqMX36dH7+858zfvx4tm/fzqbCH2ICVftEAOEfn774RSpHZbbLnXnmmdSsWROAdevWcf755/PJJ59gZmzdurXIc0488UTq1q1L3bp1adWqFd988w1t27YtcEzv3r137svIyCA7O5tGjRrRoUOHnf3zhw0bxvTp00uM7/XXX9+ZjI4++mhyc3NZt24d/fv3Z+zYsWRlZXHaaafRtm1bevXqxciRI9m6dStDhw4lIyNjTz6acqn2VUMiUrkqs12uYcOGO7dvuOEGjjrqKD766COefvrpYvvS161bd+d2zZo1i2xfKOqY3VnEq6hzzIxrr72We++9l82bN9OnTx+WLl3KwIEDmT9/Pm3atGH48OE89NBD5X693aVEICIVKqp2uXXr1tGmTRsAHnjggQq/fqdOnfjss8/Izs4G4NFHHy31nIEDBzIz1jgyb948WrRoQZMmTfj000/p2rUr48aNIzMzk6VLl7JixQpatWrFRRddxAUXXMDChQsr/D0UR4lARCpUVhZMnw5paWAW7qdPT3z17DXXXMN1111H//792b59e4Vfv379+txxxx0cf/zxHHHEEeyzzz40bdq0xHMmTpzIggUL6NatG9deey0PPvggAFOmTOHQQw+le/fu1K9fn8GDBzNv3rydjcdPPPEEV1xxRYW/h+JUuTWLMzMzXQvTiFSujz/+mEMOOSTqMCK3
ceNGGjVqhLvzq1/9ioMOOogrr7wy6rB2UdTfy8zedffMoo5XiUBEpIzuueceMjIy6NKlC+vWrePiiy+OOqQKkRK9hkREKsKVV16ZlCWAPaUSgYhIilMiEBFJcUoEIiIpTolARCTFKRGISNIbNGgQL7zwQoF9U6ZM4dJLLy3xnLyu5ieccAJr167d5ZiJEycyefLkEl979uzZLFmyZOfjG2+8kTlz5pQj+qIl03TVSgQikvSGDRvGrFmzCuybNWtWmSZ+gzBraLNmzXbrtQsngptuuoljjz12t66VrJQIRCTpnXHGGTzzzDP89NNPAGRnZ/PVV19xxBFHcMkll5CZmUmXLl2YMGFCkeenp6ezZs0aACZNmsTBBx/Mscceu3OqaghjBHr16kX37t05/fTT2bRpE2+++SZPPfUUV199NRkZGXz66aeMGDGCxx9/HICXX36ZHj160LVrV0aOHLkzvvT0dCZMmEDPnj3p2rUrS5cuLfH9RT1dtcYRiEi5jBkDixZV7DUzMmDKlOKfb968Ob179+b555/nlFNOYdasWZx11lmYGZMmTWLvvfdm+/btHHPMMXzwwQd069atyOu8++67zJo1i/fee49t27bRs2dPDjvsMABOO+00LrroIgB+85vfcN9993HZZZdx8sknM2TIEM4444wC1/rxxx8ZMWIEL7/8Mh07duS8887jzjvvZMyYMQC0aNGChQsXcscddzB58mTuvffeYt9f1NNVq0QgIlVCfPVQfLXQY489Rs+ePenRoweLFy8uUI1T2Guvvcapp55KgwYNaNKkCSeffPLO5z766CMGDBhA165dmTlzJosXLy4xnmXLltG+fXs6duwIwPnnn8/8+fN3Pn/aaacBcNhhh+2cqK44r7/+OsOHDweKnq566tSprF27llq1atGrVy9mzJjBxIkT+fDDD2ncuHGJ1y4LlQhEpFxK+uWeSEOHDmXs2LEsXLiQzZs307NnTz7//HMmT57MO++8w1577cWIESOKnX46j5kVuX/EiBHMnj2b7t2788ADDzBv3rwSr1PaPG15U1kXN9V1adfKm676xBNP5LnnnqNPnz7MmTNn53TVzz77LMOHD+fqq6/mvPPOK/H6pVGJQESqhEaNGjFo0CBGjhy5szSwfv16GjZsSNOmTfnmm2/497//XeI1Bg4cyD//+U82b97Mhg0bePrpp3c+t2HDBlq3bs3WrVt3Th0N0LhxYzZs2LDLtTp16kR2djbLly8H4OGHH+bII4/crfcW9XTVKVMieOstuO46+Ne/oJSZY0UkSQ0bNozTTjttZxVR9+7d6dGjB126dKFDhw7079+/xPN79uzJWWedRUZGBmlpaQwYMGDnc7/73e84/PDDSUtLo2vXrju//M8++2wuuugipk6durORGKBevXrMmDGDM888k23bttGrVy9Gjx69W+9r4sSJ/PKXv6Rbt240aNCgwHTVr7zyCjVr1qRz584MHjyYWbNmcdttt1G7dm0aNWpUIQvYpMw01AsWwOGHw8UXwx13JCAwkWpM01BXLZqGuhiZmXDFFXDnnfDGG1FHIyKSPFImEQDcdFNYLWnUKIh19xURSXkJTQRmdryZLTOz5WZ2bTHHDDKzRWa22MxeTWQ8jRqFaqElS+DWWxP5SiLVT1WrRk5Vu/N3SlgiMLOawDRgMNAZGGZmnQsd0wy4AzjZ3bsAZyYqnjwnnABnnw033wxxgwpFpAT16tUjNzdXySDJuTu5ubnUq1evXOclstdQb2C5u38GYGazgFOA+NEe5wBPuvtKAHf/NoHx7DRlCjz/fKgieuUVqJFSFWQi5de2bVtycnJYvXp11KFIKerVq0fbtm3LdU4iE0Eb4Iu4xznA4YWO6QjUNrN5QGPgf919z/tClWKffWDyZLjwQrj//nAvIsWrXbs27du3jzoMSZBE/hYuavhe4XJlLeAw4ETg58ANZtZxlwuZjTKzBWa2oKJ+kYwcCYMGwdVXw9dfV8gl
RUSqpEQmghxg/7jHbYGvijjmeXf/wd3XAPOB7oUv5O7T3T3T3TNbtmxZIcGZwd13w+bNYRItEZFUlchE8A5wkJm1N7M6wNnAU4WO+RcwwMxqmVkDQtXRxwmMqYCOHeE3v4FHH4Vnn62sVxURSS4JSwTuvg34NfAC4cv9MXdfbGajzWx07JiPgeeBD4C3gXvd/aNExVSUa66BLl3gkktg48bKfGURkeSQMlNMlOStt6B//zDy+C9/qdBLi4gkBU0xUYq+fUOJYOpUeOedqKMREalcSgQxv/897LsvXHQRbN0adTQiIpVHiSCmaVOYNg3ef1/VQyKSWpQI4gwdCqeeChMmwKefRh2NiEjlUCIo5PbboXZtGD0aqlg7uojIblEiKKRNG7jlFpgzB/72t6ijERFJPCWCIoweDf36wZVXgubYEpHqTomgCDVqwPTpsH49XHVV1NGIiCSWEkExunSBcePg4YfhpZeijkZEJHGUCEowfnyYj2j0aNi0KepoREQSQ4mgBPXqhSqizz6D3/426mhERBJDiaAURx4JF1wAf/oTLFoUdTQiIhVPiaAMbr0VmjcP009s3x51NCIiFUuJoAz23jtMSLdgQRhwJiJSnSgRlNEvfgEnnBAWslmxIupoREQqjhJBGZnBHXeE7V/9StNPiEj1oURQDmlpcPPNYVnLxx4r//kzZ0J6ehiwlp4eHouIRE2JoJwuuwwyM+Hyy+H778t+3syZMGpUqFZyD/ejRikZiEj0lAjKqWbNMLYgNzesd1xW48fvOiht06awX0QkSkoEu6FHDxg7Fu69F159tWznrFxZvv0iIpVFiWA3TZwI7duH6p0ffyz9+HbtyrdfRKSyKBHspgYN4K674L//Desdl2bSpHBO4WtMmpSY+EREykqJYA/87Gdw7rlhIZvFi0s+NisrtC2kpYWuqGlp4XFWVuXEKiJSHPMq1iE+MzPTFyxYEHUYO61eDYccEmYpff310DVURCTZmNm77p5Z1HP62tpDLVvCn/8Mb70Fd98ddTQiIuWnRFABhg+HY44JC9l8+WXU0YiIlI8SQQUwCw3HW7eGAWciIlWJEkEFOfDA0KX0n/8MNxGRqkKJoAKNHQvdusGvfw3r1kUdjYhI2SgRVKDateGee2DVKrj++qijEREpGyWCCta7d2gnuPNOePPNqKMRESmdEkEC3HwztG0bpp/YsiXqaERESqZEkACNG8O0aWG08a23Rh2NiEjJlAgS5KST4Mwz4Xe/g2XLoo5GRKR4SgQJNHUq1K8PF1+spS1FJHkpESTQvvvCbbeFNQvuvz/qaEREiqZEkGAXXAADBsD//A98803U0YiI7CqhicDMjjezZWa23MyuLeL5QWa2zswWxW43JjKeKNSoEaab3rQJxoyJOhoRkV0lLBGYWU1gGjAY6AwMM7PORRz6mrtnxG43JSqeKHXqFNYmnjULnnsu6mhERApKZImgN7Dc3T9z9y3ALOCUBL5eUhs3LqxbcMklsHFj1NGIiORLZCJoA3wR9zgntq+wvmb2vpn928y6FHUhMxtlZgvMbMHq1asTEWvC1a0bqohWroQbbog6GhGRfIlMBFbEvsKdKBcCae7eHbgdmF3Uhdx9urtnuntmy5YtKzbKSnTEEaFEMGUKzJgRdTQiIkEiE0EOsH/c47bAV/EHuPt6d98Y234OqG1mLRIYU+T+/Gc47ji48EL4xz+ijkZEJLGJ4B3gIDNrb2Z1gLOBp+IPMLN9zcxi271j8eQmMKbI1asX1ivo1w/OOQeefTbqiEQk1SUsEbj7NuDXwAvAx8Bj7r7YzEab2ejYYWcAH5nZ+8BU4Gz36j8Gt2FDeOYZ6N4dTj8d5s6NOiIRSWVW1b53MzMzfcGCBVGHUSFyc+HIIyE7G156Cfr2jToiEamuzOxdd88s6jmNLI5Q8+YhAbRuDYMHw3vvRR2RiKQiJYKItW4Nc+ZAkybws5/Bxx9HHZGIpBolgiSQlgYvvww1a8Kx
x8Jnn0UdkYikEiWCJHHQQaGa6McfQzLIyYk6IhFJFUoESaRrV3j+eVizJiSDb7+NOiIRSQVKBEmmV68wtmDlytBm8P33UUckItWdEkESGjAgDDr7+GM44QTYsCHqiESkOlMiSFI//zk8+ii88w6cfDJs3hx1RCJSXSkRJLGhQ+HBB8NSl2ecAVu2RB2RiFRHSgRJLisL7rorLGhz7rmwbVvUEYlIdVOrLAeZWUNgs7vvMLOOQCfg3+6+NaHRCQCjRoXFbK66Cho0gPvvD0tgiohUhDIlAmA+MMDM9gJeBhYAZwFZiQpMCho7NjQaT5wIjRrB7beDFbXig4hIOZU1EZi7bzKzC4Db3f1WM9PMOJXsxhtDyWDy5JAM/vAHJQMR2XNlTgRm1pdQArignOdKBTGDW28NyeCPf4TGjWH8+KijEpGqrqxf5mOA64B/xtYU6AC8krCopFhmMG0a/PAD/OY3oWRwxRVRRyUiVVmZEoG7vwq8CmBmNYA17n55IgOT4tWoERqMf/gBxowJyeCCC0o9TUSkSGXqe2Jmj5hZk1jvoSXAMjO7OrGhSUlq1YJHHoHjj4eLLoJZs6KOSESqqrJ2Quzs7uuBocBzQDtgeKKCkrKpWxeeeCJMSTF8ODz9dNnOmzkT0tNDySI9PTwWkdRV1kRQ28xqExLBv2LjB6rWGpfVVIMGIQH06AFnnhkWuSnJzJlhXMKKFeAe7keNUjIQSWVlTQR3A9lAQ2C+maUB6xMVlJRPkyZh+uqOHeGUU+CNN4o/dvx42LSp4L5Nm9T7SCSV7fbi9WZWy90rfcKD6rR4fUX75hsYOBC+/hpeeQV69tz1mBo1QkmgMDPYsSPxMYpINPZ48Xoza2pmfzazBbHbnwilA0ki++wTqob22iusZbB48a7HtGtX9LnF7ReR6q+sVUP3AxuAX8Ru64EZiQpKdt/++4dkUKcOHHccfPppwecnTQrtCvEaNAj7RSQ1lTURHODuE9z9s9jtt0CHRAYmu+/AA0My2LIFjjkGvvgi/7msLJg+HdLSQnVQWlp4nKVZo0RSVlkTwWYzOyLvgZn1B7RUShLr3BlefDEsdXnssaH9IE9WFmRnhzaB7GwlAZFUV9ZEMBqYZmbZZpYN/BW4OGFRSYXo2TOsY5CTE6qJvvsu6ohEJBmVKRG4+/vu3h3oBnRz9x7A0QmNTCpE//7wr3/BsmUweLDWPxaRXZVreRN3Xx8bYQwwNgHxSAIceyw8/jgsXAhDhuw6jkBEUtuerHOlmfCrkJNOgocfhtdeg9NPh59+ijoiEUkWe5IINMVEFXP22XDPPWEU8jnnaP1jEQlKnIbazDZQ9Be+AfUTEpEk1AUXhIVtxoyBX/wC7r4bWraMOioRiVKJicDdG1dWIFJ5rrgidB0dNw46dQqrnv3yl2H6CRFJPfqvn6KuvBLefx+6dIELL4RBg2DJkqijEpEoKBGksEMOgXnz4L774KOPICMjLH+5WUMFRVKKEkGKq1EDRo6EpUtDY/KkSdC1K7z0UtSRiUhlUSIQAFq1goceCnMU1agRZi8991z49tuoIxORRFMikAKOOQY++ABuvBEeeyw0Jt97r9YqEKnOEpoIzOx4M1tmZsvN7NoSjutlZtvN7IxExiNlU68e/Pa3ISF07QoXXRQWvClqfQMRqfoSlgjMrCYwDRgMdAaGmVnnYo77I/BComKR3dOpU2hMnjEDPv44NCaPH6/GZJHqJpElgt7A8tj6BVuAWcApRRx3GfAEoNroJGQGI0aExuSsLPj97+HQQ+EFpW2RaiORiaANELckCjmxfTuZWRvgVOCuki5kZqPylslcvXp1hQcqpWvZEh54AObOhVq14PjjwzQVX38ddWQisqcSmQiKmpSu8HQVU4Bx7r69pAu5+3R3z3T3zJaaDyFSRx0V2g4mToQnnghjEaZPV2OySFWWyESQA+wf97gt8FWhYzKBWbHFbs4A7jCzoQmMSSpA3bowYUJICBkZcPHF
MGBAGJQmIlVPIhPBO8BBZtbezOoAZwNPxR/g7u3dPd3d04HHgUvdfXYCY5IKdPDBoarowQfDwjc9esB112m9A5GqJmGJwN23Ab8m9Ab6GHjM3Reb2WgzG52o15XKZQbnnRcak4cPh1tuCY3Jzz8fdWQiUlbmXrWWFcjMzPQFCxZEHYYU49VXYfTokBjOOgv+8hdo3TrqqETEzN5198yintPIYqlQRx4JixbBTTfB7NmhMfmuu9SYLJLMlAikwtWtCzfcEBqTDzsMLrkE+veHDz8seNzMmZCeHuY2Sk8Pj0Wk8ikRSMJ07BgmsXvoIVi+PDQmjxsHP/wQvvRHjYIVK8A93I8apWQgEgW1EUilyM0NSeC++8Kv/02bip7ZNC0NsrMrOzqR6k9tBBK55s3DLKbz50P9+sVPb71yZeXGJSJKBFLJBgwIjcnNmhX9fLt2lRmNiIASgUSgTh3461/DdNfxatSAU06BLVuiiUskVSkRSCSyskJVUVpaeNy0abhNnQpt28JVV8GSJdHGKJIqlAgkMllZoWHYHdauhdWr4bnnwiI4t98OXbpA374hYWzYEHW0ItWXEoEkjZo1YfBgePxx+PJL+NOfYP36sEJa69YwciS88UZIHCJScZQIJCm1bAljx4YZTd96C84+G/7xDzjiCOjcGW67Db75JuooRaoHJQJJambQp0+oHlq1KoxD2HtvuOaa0JZw2mnw7LOwbVvUkYpUXUoEUmU0apRfPbRkCYwZE7aHDAmNzuPHw6efRh2lSNWjRCBV0iGHhOqhnBx48skwfcUtt8CBB4ZV1P72N9i8OeooRaoGJQKp0mrXhlNPhWeeCaOSJ02CL74IayO0bg2XXgrvvqsGZpGSKBFItdGmDVx/Pfz3v/DKK3DSSTBjBmRmhhLD7bfDd99FHaVI8lEikGqnRg0YNAgefjg0ME+bFrqmXn457LcfDBsWZkXVGgkigRKBVGvNmuVXD733XhiT8MILcNxxcMABYQGdL76IOkqRaCkRSMrIyAjVQ199BY88EhLBhAmhx9Hxx8Pdd8Mnn6g9QVKP1iOQlPb556Ed4aGHwuI4EMYnHHUUHH10uGlGVKkOSlqPQIlAhFAKWL4c5s7Nv61ZE5474ID8pHDUUbDPPtHGKrI7tDCNSAlmzoT27eHgg+EPf4CTTw7TV3zwAUyZEia/e/TR0Mi8775w6KGh4Xn2bPj++6ijF9lzKhFISstbO3nTpvx9DRrA9OlhdtQ827aFxua5c0PX1NdeC+eYha6peSWGAQPCCGiRZKOqIZFipKfntw3EK23t5C1b4O2386uR3nor7KtVC3r3zk8MffvuugCPSBSUCESKUaNG0b2EzMo3zmDTJnjzzfzE8M474fy6daF///zG5169wmhokcqmRCBSjN0tEZRm3bpQfZSXGN5/P+xv2DAsvJNXYujePQx2E0m0khJBrcoORiSZTJpUdBvBpEl7dt2mTcOsqEOGhMdr1sCrr+YnhquvDvv32iuMgs7rkdS5cyiNiFQmlQgk5c2cGaawXrkyjBmYNKlgQ3EifPVVaHSeOxdefjm/VNKqFfTsCd265d8OPhjq1ElsPFL9qWpIJMl9/nlICvPnh2qkJUtg69bwXO3aYdrtbt1CVVJegthnH5UepOyUCESqmK1bYdmyMJYh/vbll/nHtGxZsOTQrVuoWlIvJSmKEoFINZGbCx9+WDA5fPRR/iI8NWtCx44FSw7duoVpM1R6SG1KBCLV2PbtYXqMwqWH+F5PzZrtWno49NDQi0lSgxKBSApaty6UFgoniI0bw/NmYWnPwgkiPT2Mr5DqRd1HRVJQ06ZhMFv//vn7duwIPZQKJ4cnn8wfWNeoUZhfqX37kBTS0sJ9enroVdWgQQRvRhJKJQKRJBFFN9Y8P/wQeiq9/35IDEuWhISxYkV+76U8rVrtmiDyttPSNNdSslKJQCTJFZ78bsWK8BgqJxk0bBimv+jVq+D+HTvCcp8rVoQ2h+zs
/O3334ennoKffip4TvPmuyaI+O0mTRL/fqR8VCIQSQKJmuoi0XbsgG+/LZggCm/n9WjKs9deRSeIvFuzZpX4BlJIZCUCMzse+F+gJnCvu99S6PlTgN8BO4BtwBh3fz2RMYkko5Ury7c/WdSoEdZo2Hdf6NNn1+fdw/QaRSWI5cvhpZdCtVS8Jk3yE8T++4eBc/vsE6qk4u8bNVKX2IqSsERgZjWBacBxQA7wjpk95e5L4g57GXjK3d3MugGPAZ0SFZNIsmrXrugSQVVfJtMsDHxr2XLXaicIieK774ovTbzxRni+KPXr75oc8u4L72veXD2hSpLIEkFvYLm7fwZgZrOAU4CdicDdN8Yd3xCoWvVUIhUkUZPfJTuz8CXdvHmYY6koW7fC6tWhCuqbb8ItbzvvPicHFi4Mj7dt2/UaNWtCixbFly7iE0jLlmH68FSSyETQBvgi7nEOcHjhg8zsVOAPQCvgxKIuZGajgFEA7ar6TySRIuQ1CEfVayiZ1a4N++0XbqXZsSMsH1o4URROHsuXh+34xBuvWbOCSaJFi9C2sdde4bmitps2rbpTiiessdjMzgR+7u4Xxh4PB3q7+2XFHD8QuNHdjy3pumosFpGK8sMPJSeMvH25uSHBFFXaiNekScnJoqTtRM8RFVVjcQ6wf9zjtsBXxR3s7vPN7AAza+HuaxIYl4gIELrNdugQbqVxDyWI77+HtWvDfWnbn3ySv124UbywevVKTxZ9+4ZbRUtkIngHOMjM2gNfAmcD58QfYGYHAp/GGot7AnWA3ATGJCKyW8xC4mjYMEziV15btoSEUNYksmpVGNiXd447XH99FUsE7r7NzH4NvEDoPnq/uy82s9Gx5+8CTgfOM7OtwGbgLK9qAxtERMqgTp3Q3tCqVfnP3bED1q9PXM+nhHaocvfn3L2jux/g7pNi++6KJQHc/Y/u3sXdM9y9r8YQiERr5sz8SefS08NjiV6NGqFqKFGjsjXFhIgA0U9zIdHREAsRAULX1cLdKTdtCvulelMiEBGg6k5zIXtOiUBEgOKns9AYzupPiUBEgDCSufCiM6kwzYUoEYhITFYWTJ8eZv00C/fTp6uhOBWo15CI7JSVpS/+VKQSgYhIilMiEJGko4FtlUtVQyKSVDSwrfKpRCAiSUUD2yqfEoGIJBUNbKt8SgQiklQ0sK3yKRGISFLRwLbKp0QgIklFA9sqnxKBiCSdrCzIzg4LsmRnR5cEUqUbq7qPiogUIZW6sapEICJShFTqxqpEICJShFTqxqpEICJShFTqxqpEICJShFTqxqpEICJShFTqxqpEICJSjFTpxqruoyIiSawyurGqRCAiksQqoxurEoGISBKrjG6sSgQiIkmsMrqxKhGIiCSxyujGqkQgIpLEKqMbq3oNiYgkuaysxHZdVYlARCTFKRGIiKQ4JQIRkRSnRCAikuKUCEREUpy5e9QxlIuZrQZWRB3HHmoBrIk6iCSiz6MgfR759FkUtCefR5q7tyzqiSqXCKoDM1vg7plRx5Es9HkUpM8jnz6LghL1eahqSEQkxSkRiIikOCWCaEyPOoAko8+jIH0e+fRZFJSQz0NtBCIiKU4lAhGRFKdEICKS4pQIKpGZ7W9mr5jZx2a22MyuiDqmqJlZTTN7z8yeiTqWqJlZMzN73MyWxv6N9I06piiZ2ZWx/ycfmdnfzaxe1DFVJjO738y+NbOP4vbtbWYvmdknsfu9KuK1lAgq1zbgKnc/BOgD/MrMOkccU9SuAD6OOogk8b/A8+7eCehOCn8uZtYGuBzIdPdDgZrA2dFGVekeAI4vtO9a4GV3Pwh4OfZ4jykRVCJ3X+XuC2PbGwj/0dtEG1V0zKwtcCJwb9SxRM3MmgADgfsA3H2Lu6+NNKjo1QLqm1ktoAHwVcTxVCp3nw98V2j3KcCDse0HgaEV8VpKBBExs3SgB/CfiEOJ0hTgGmBHxHEkgw7AamBGrKrsXjNrGHVQUXH3L4HJwEpgFbDO
3V+MNqqksI+7r4LwwxJoVREXVSKIgJk1Ap4Axrj7+qjjiYKZDQG+dfd3o44lSdQCegJ3unsP4AcqqNhfFcXqvk8B2gP7AQ3N7Nxoo6q+lAgqmZnVJiSBme7+ZNTxRKg/cLKZZQOzgKPN7G/RhhSpHCDH3fNKiI8TEkOqOhb43N1Xu/tW4EmgX8QxJYNvzKw1QOz+24q4qBJBJTIzI9QBf+zuf446nii5+3Xu3tbd0wmNgHPdPWV/8bn718AXZnZwbNcxwJIIQ4raSqCPmTWI/b85hhRuPI/zFHB+bPt84F8VcVEtXl+5+gPDgQ/NbFFs3/Xu/lx0IUkSuQyYaWZ1gM+AX0YcT2Tc/T9m9jiwkNDb7j1SbLoJM/s7MAhoYWY5wATgFuAxM7uAkCzPrJDX0hQTIiKpTVVDIiIpTolARCTFKRGIiKQ4JQIRkRSnRCAikuKUCERizGy7mS2Ku1XYyF4zS4+fRVIkmWgcgUi+ze6eEXUQIpVNJQKRUphZtpn90czejt0OjO1PM7OXzeyD2H272P59zOyfZvZ+7JY3NUJNM7snNsf+i2ZWP3b85Wa2JHadWRG9TUlhSgQi+eoXqho6K+659e7eG/grYdZUYtsPuXs3YCYwNbZ/KvCqu3cnzBe0OLb/IGCau3cB1gKnx/ZfC/SIXWd0Yt6aSPE0slgkxsw2unujIvZnA0e7+2exSQO/dvfmZrYGaO3uW2P7V7l7CzNbDbR195/irpEOvBRbUAQzGwfUdvebzex5YCMwG5jt7hsT/FZFClCJQKRsvJjt4o4pyk9x29vJb6M7EZgGHAa8G1uIRaTSKBGIlM1ZcfdvxbbfJH/5xCzg9dj2y8AlsHNN5ibFXdTMagD7u/srhEV6mgG7lEpEEkm/PETy1Y+bFRbC+sF5XUjrmtl/CD+ehsX2XQ7cb2ZXE1YXy5st9ApgemyGyO2EpLCqmNesCfzNzJoCBvxFS1RKZVMbgUgpYm0Eme6+JupYRBJBVUMiIilOJQIRkRSnEoGISIpTIhARSXFKBCIiKU6JQEQkxSkRiIikuP8HB9CNSZM0SfAAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Plot the loss over time\n",
    "acc = history_dict['binary_accuracy']\n",
    "val_acc = history_dict['val_binary_accuracy']\n",
    "loss = history_dict['loss']\n",
    "val_loss = history_dict['val_loss']\n",
    "\n",
    "epochs = range(1, len(acc) + 1)\n",
    "\n",
    "# \"bo\" is for \"blue dot\"\n",
    "plt.plot(epochs, loss, 'bo', label='Training loss')\n",
    "# b is for \"solid blue line\"\n",
    "plt.plot(epochs, val_loss, 'b', label='Validation loss')\n",
    "plt.title('Training and validation loss')\n",
    "plt.xlabel('Epochs')\n",
    "plt.ylabel('Loss')\n",
    "plt.legend()\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "id": "Z3PJemLPXwz_"
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYgAAAEWCAYAAAB8LwAVAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAAArZElEQVR4nO3deXxU1f3/8deHIEsAQVaVLWhV1CqIKVXcULHiri1+haYq2hZxqZX+rFq1amvp16qt1mq1WNGqtKB1r2gVaqVfl0rYEVwQg0ZQEWSTHT6/P84NmYSbZAKZ3Enm/Xw85jF3n8/cwP3MOefec8zdERERqaxJ0gGIiEh2UoIQEZFYShAiIhJLCUJERGIpQYiISCwlCBERiaUEIWkzsxfM7Py63jZJZlZiZoMycFw3s69F0/eZ2c/T2XYHPqfIzF7a0ThFqmN6DqJxM7M1KbP5wAZgSzR/kbuPq/+osoeZlQA/cPdJdXxcB/Zx9wV1ta2ZFQAfAru4++Y6CVSkGk2TDkAyy91bl01XdzE0s6a66Ei20L/H7KAqphxlZgPNrNTMrjazT4EHzWw3M/uHmS01sy+j6W4p+/zbzH4QTQ83s/8zs9ujbT80s5N2cNteZjbFzFab2SQzu8fMHq0i7nRivNnMXouO95KZdUxZf66ZLTKzZWZ2XTXn5zAz+9TM8lKWnWVms6Pp/mb2hpmtMLMlZna3mTWr4lgPmdmvUuZ/Gu2z2MwurLTtKWY2w8xWmdnHZnZTyuop0fsKM1tjZoeXnduU/QeY2VQzWxm9D0j33NTyPLc3swej7/ClmT2dsu4MM5sZfYcPzGxwtLxCdZ6Z3VT2dzazgqiq7ftm9hHwr2j549HfYWX0b+TAlP1bmtlvo7/nyujfWEsze97MflTp+8w2szPjvqtUTQkit+0OtAd6AiMI/x4ejOZ7AOuAu6vZ/5vAu0BH4FbgATOzHdj2r8BbQAfgJuDcaj4znRi/C1wAdAaaAVcCmNkBwL3R8feMPq8bMdz9TeAr4LhKx/1rNL0FGBV9n8OB44FLqombKIbBUTwnAPsAlds/vgLOA9oBpwAXp1zYjo7e27l7a3d/o9Kx2wPPA3dF3+13wPNm1qHSd9ju3MSo6Tw/QqiyPDA61h1RDP2Bh4GfRt/haKCkis+IcwywP3BiNP8C4Tx1BqYDqVWitwOHAgMI/46vArYCfwG+V7aRmfUBugITaxGHALi7XjnyIvxHHRRNDwQ2Ai2q2b4v8GXK/L8JVVQAw4EFKevyAQd2r822hIvPZiA/Zf2jwKNpfqe4GK9Pmb8EeDGavgEYn7KuVXQOBlVx7F8BY6PpNoSLd88qtr0CeCpl3oGvRdMPAb+KpscCt6Rst2/qtjHHvRO4I5ouiLZtmrJ+OPB/0fS5wFuV9n8DGF7TuanNeQb2IFyId4vZ7k9l8Vb37y+av6ns75zy3faqJoZ20TZtCQlsHdAnZrvmwHJCuw6ERPLHTPyfauwvlSBy21J3X182Y2b5ZvanqMi+ilCl0S61mqWST8sm3H1tNNm6ltvuCSxPWQbwcVUBpxnjpynTa1Ni2jP12O7+FbCsqs8ilBa+bWbNgW8D0919URTHvlG1y6dRHL8mlCZqUiEGYFGl7/dNM3slqtpZCYxM87hlx15Uadkiwq/nMlWdmwpqOM/dCX+zL2N27Q58kGa8cbadGzPLM7NbomqqVZSXRDpGrxZxn+XuG4DHgO+ZWRNgGKHEI7WkBJHbKt/C9v+A/YBvuvuulFdpVFVtVBeWAO3NLD9lWfdqtt+ZGJekHjv6zA5Vbezu8wgX2JOoWL0EoarqHcKv1F2Ba3ckBkIJKtVfgWeB7u7eFrgv5bg13XK4mFAllKoH8EkacVVW3Xn+mPA3axez38fA3lUc8ytC6bHM7jHbpH7H7wJnEKrh2hJKGWUxfAGsr+az/gIUEar+1nql6jhJjxKEpGpDKLaviOqzb8z0B0a/yIuB
m8ysmZkdDpyWoRj/DpxqZkdGDcq/pOb/A38FLidcIB+vFMcqYI2Z9QYuTjOGx4DhZnZAlKAqx9+G8Ot8fVSf/92UdUsJVTt7VXHsicC+ZvZdM2tqZucABwD/SDO2ynHEnmd3X0JoG/hj1Ji9i5mVJZAHgAvM7Hgza2JmXaPzAzATGBptXwgMSSOGDYRSXj6hlFYWw1ZCdd3vzGzPqLRxeFTaI0oIW4HfotLDDlOCkFR3Ai0Jv87eBF6sp88tIjT0LiPU+08gXBji3MkOxujubwOXEi76S4AvgdIadvsbob3mX+7+RcryKwkX79XA/VHM6cTwQvQd/gUsiN5TXQL80sxWE9pMHkvZdy0wGnjNwt1Th1U69jLgVMKv/2WERttTK8Wdrjup/jyfC2wilKI+J7TB4O5vERrB7wBWAq9SXqr5OeEX/5fAL6hYIovzMKEE9wkwL4oj1ZXAHGAqoc3hN1S8pj0MHERo05IdoAflJOuY2QTgHXfPeAlGGi8zOw8Y4e5HJh1LQ6UShCTOzL5hZntHVRKDCfXOTyccljRgUfXdJcCYpGNpyJQgJBvsTrgFcw3hHv6L3X1GohFJg2VmJxLaaz6j5mosqYaqmEREJJZKECIiEqtRddbXsWNHLygoSDoMEZEGY9q0aV+4e6e4dY0qQRQUFFBcXJx0GCIiDYaZVX76fhtVMYmISCwlCBERiaUEISIisZQgREQklhKEiIjEUoIQEWmgxo2DggJo0iS8jxtX0x6106hucxURyRXjxsGIEbA2Gmpr0aIwD1BUVDefoRKEiEgDdN115cmhzNq1YXldUYIQEWmAPvqodst3hBKEiEgD1KPyYLU1LN8RShAiIrWU6cbhdIweDfn5FZfl54fldUUJQkSkFsoahxctAvfyxuH6ThJFRTBmDPTsCWbhfcyYumughkY2HkRhYaGrsz4RyaSCgpAUKuvZE0pK6juanWdm09y9MG6dShAiIrVQH43D2UIJQkSkFuqjcThbKEGIiNRCfTQOZwslCBFpUJK+g6g+GoezhbraEJEGoz66l0hHUVHjTAiVqQQhIg1GfXQvIeUymiDMbLCZvWtmC8zsmpj1u5nZU2Y228zeMrOvp7uviOSeXLqDKBtkLEGYWR5wD3AScAAwzMwOqLTZtcBMdz8YOA/4fS32FZEck0t3EGWDTJYg+gML3H2hu28ExgNnVNrmAGAygLu/AxSYWZc09xWRHJNLdxBlg0wmiK7AxynzpdGyVLOAbwOYWX+gJ9AtzX2J9hthZsVmVrx06dI6Cl1EslEu3UGUDTKZICxmWeV+PW4BdjOzmcCPgBnA5jT3DQvdx7h7obsXdurUaSfCFZHqJH17aZmiotClxdat4V3JIXMyeZtrKdA9Zb4bsDh1A3dfBVwAYGYGfBi98mvaV0TqT7bcXir1K5MliKnAPmbWy8yaAUOBZ1M3MLN20TqAHwBToqRR474iUn90e2luylgJwt03m9llwD+BPGCsu79tZiOj9fcB+wMPm9kWYB7w/er2zVSsIlI93V6am9Tdt4jUqLF1cS3l1N23iOwU3V6am5QgRLJcNtw9pNtLc5M66xPJYtl091CudFAn5VSCEMliuntIkqQEIZLFdPeQJEkJQiSLqXM6SZIShEgW091DkiQlCJEspruHJEm6i0kky+nuIUmKShAiIhJLCUKkCtnwgJpIklTFJBIjmx5QE0mKShAiMfSAmohKECKx9ICaZMrWreG1ZUv5K3V+R9bl5cE3vlH3sSpBiMTo0SO+e2s9oNZwucOmTfDVV1W/1q6t3fp162p/Mc+ELl3g00/r/rhKECIxRo+u2AYBekAtCZs2wZdfhtfy5eFVNr9mTe0v8LW9QDdvDq1ahb99q1blrw4dwo+Fli3Dr/eyV5MmVc/v6Lp0tm3ZMjPnXwlCJEZZQ/R114VqpR49QnJQA3XtuYeLednFPfVCX930l1/C6tXVH7tJk4oX7tSLeceOVa+Le1Vel58PTXP8
CqkR5UQkLevXw8qV8b/oa7rgb95c9XGbNYP27cNrt93ipyvPt2sHrVuHX/hm9XYKGqXqRpTL8fwo0vi5l1/cd+a1cWP1n9O2bcWLePfuVV/wU6dbttRFPlspQYhkua1b4YsvYNmyHbuwr1oV6vJr0qZNuMiXvTp3hn32gV13rbg87mLfrp2qYxoj/UlFErR+PSxeDJ98sv2rtDS8L15c/QXebPuL+J57wv77h+nK6+JebdqEBk+RVEoQIhngHure4y74qa8vvth+3/x86No1vI46qny6Y8f4i3vr1qGxVqSuKUFI1hk3LrvvHtq0KdxzHnfBL0sEixeHe+Qr69w5XOy7d4fDDiu/+HfrVj7dtq3q5CU7KEFIVsmGPpDcw4X+7bdh7lxYuLBiMvjss7BNqmbNyi/whYUVL/hlrz32CHfdiDQUus1VskpBQfwTzD17QklJ3X6WO3z+eXkiSH1fubJ8u3btyi/4cRf+bt3Cg1P61S8NkW5zlQYjU30gLV8enwhS2wDat4cDD4Tvfje8f/3r4b1jx537bJGGSglCssrO9oG0ahXMm7d9IliypHybNm3Chf/MMysmgt13VylAJJUShGSVdPtAWrsW5s/fPhGkljRatgwX/m99qzwJfP3roUpIiUCkZkoQklUq94HUvTuMHBku6NddV54IFi4sbyhu1izc83/kkRUTQdlocCKyY5QgJKts3Rou7KedBpMmwfvvw7XXhnVNm8K++0K/fnDeeeWJYO+99RSvSCbov5Ukzh1mzIDx42HChFByaNECjj8ehgwpTwT77htKCyJSP5QgJDHz54ekMH48vPdeKAWceGJobzj99NBFhIgkRwlC6tWHH4ZSwt/+BrNnh7aFY4+FK6+Eb387PE8gItlBCUIybvFiePzxkBT++9+w7PDD4fe/h7PPDk8Yi0j2UYKQjFi2DJ54IiSFV18N7Qx9+8Itt8A554SGaBHJbkoQUmdWrYJnnglJ4eWXwyhi++4LN9wQksL++ycdoYjURkYThJkNBn4P5AF/dvdbKq1vCzwK9Ihiud3dH4zWlQCrgS3A5qr6CpFkrVsHzz8fksLzz8OGDeGp55/8BIYODaUGPZQm0jBlLEGYWR5wD3ACUApMNbNn3X1eymaXAvPc/TQz6wS8a2bj3L1scMNj3T2mx3xJ0saN8NJL4e6jZ54JA9J36RKegB42LHRjraQg0vBlsgTRH1jg7gsBzGw8cAaQmiAcaGNmBrQGlgPVDG8uSdmyBf7975AUnngiDIaz226hlDBsGBxzjEYkE2lsMpkgugIfp8yXAt+stM3dwLPAYqANcI67b43WOfCSmTnwJ3cfE/chZjYCGAHQI90e3SQt7vDGGyEpPPZYGAehdWs444yQFE44QQ+uiTRmmUwQcZUMlQefOBGYCRwH7A28bGb/cfdVwBHuvtjMOkfL33H3KdsdMCSOMRDGg6jLL5BrykZyW7QoPKTWrFnoDrt5czjllJAUTj45dJ4nIo1fJhNEKdA9Zb4boaSQ6gLgFg+jFi0wsw+B3sBb7r4YwN0/N7OnCFVW2yUIqRvjxsEPf1g+TOaqVaGju5Ej4Te/0VPNIrkok31dTgX2MbNeZtYMGEqoTkr1EXA8gJl1AfYDFppZKzNrEy1vBXwLmJvBWHPeqFHbj6G8dSu88IKSg0iuylgJwt03m9llwD8Jt7mOdfe3zWxktP4+4GbgITObQ6iSutrdvzCzvYCnQts1TYG/uvuLmYo1l61YAVdcAUuXxq/f2ZHcRKThyuhzEO4+EZhYadl9KdOLCaWDyvstBPpkMjaBF1+EH/wAPv0U2ratOA5zGbX7i+QuDaeSg1atCs8snHRSSAxvvgn33LN943PcSG4ikjuUIHLM5Mlw0EHwwANw9dUwbRoUFoaR3MaMgZ49w0NuPXuG+bIR3kQk96gvphyxZk1ICH/8I+y3H7z2WnjiOVVRkRKCiJRTCSIHTJkCffrAvfeGPpJmzNg+OYiIVKYE
0YitXRvuUBo4MFQbvfoq/Pa30LJl0pGJSEOgKqZG6vXXYfhweP99uOyyMA5Dq1ZJRyUiDYlKEI3M+vVw1VVw1FGh19XJk+EPf1ByEJHaUwmiEZk6Fc4/H+bPD7ex3n47tGmTdFQi0lCpBNEIbNgA118fxnletSo8APenPyk5iMjOUQmigZsxI5Qa5swJbQ533AHt2iUdlYg0BipBNFCbNsEvfgH9+4d+lJ57Dh58UMlBROqOShAN0Jw5odQwY0Z4sO2uu6B9+6SjEpHGRiWIBmTzZvjf/4VDD4XSUnjySXj0USUHEckMlSAaiPnzQxvDW2/B2WeHzvU6dUo6KhFpzGosQZjZqWamkkZCtmwJTz8fcggsWFA+PrSSg4hkWjoX/qHA+2Z2q5ntn+mApNz778Mxx8CVV8LgwfD223DOOUlHJSK5osYE4e7fAw4BPgAeNLM3zGxE2ZCgUve2bg0Nz336hKTwyCPw1FOw++5JRyYiuSStqiN3XwU8AYwH9gDOAqab2Y8yGFtOWrgQjjsOfvzj0Mne3Lnwve+FzvZEROpTOm0Qp5nZU8C/gF2A/u5+EmFI0CszHF/OcA9PPx98MEyfHgb0ef556No16chEJFelcxfT2cAd7j4ldaG7rzWzCzMTVu4ZOxZGjoRBg0Jy0FjQIpK0dBLEjcCSshkzawl0cfcSd5+cschyyOzZoUvuQYNCP0p5eUlHJCKSXhvE48DWlPkt0TKpA6tWwZAhsNtuMG6ckoOIZI90ShBN3X1j2Yy7bzSzZhmMKWe4h265P/gAXnkFOndOOiIRkXLplCCWmtnpZTNmdgbwReZCyh333gsTJsDo0XD00UlHIyJSUToliJHAODO7GzDgY+C8jEaVA6ZNg1Gj4OSTwwhwIiLZpsYE4e4fAIeZWWvA3H115sNq3FasCP0pdekCDz8MTdSRiYhkobQ66zOzU4ADgRYWPbHl7r/MYFyNljtccAF8/DFMmQIdOiQdkYhIvBoThJndB+QDxwJ/BoYAb2U4rkbrzjvh6afhd78LQ4SKiGSrdCo3Brj7ecCX7v4L4HCge2bDapzefDO0N5x5JlxxRdLRiIhUL50EsT56X2tmewKbgF6ZC6lxWrYM/ud/oHv3MDSo+lYSkWyXThvEc2bWDrgNmA44cH8mg2pstm6F886Dzz6D117TuNEi0jBUmyCigYImu/sK4Akz+wfQwt1X1kdwjcVtt8HEiWEUuMLCpKMREUlPtVVM7r4V+G3K/AYlh9qZMgWuuy4M9HPxxUlHIyKSvnTaIF4ys++Yqda8tj7/HIYOhb32gjFj1O4gIg1LOm0QPwFaAZvNbD3haWp3910zGlkDt2ULFBXBl1/CCy/ArjpbItLApPMktYYW3QGjR8OkSXD//WHoUBGRhiadEeWOjnulc3AzG2xm75rZAjO7JmZ9WzN7zsxmmdnbZnZBuvtms8mT4aab4Nxz4fvfT2+fceOgoCB0u1FQEOZFRJJk7l79BmbPpcy2APoD09z9uBr2ywPeA04ASoGpwDB3n5eyzbVAW3e/2sw6Ae8CuxPGnKh23ziFhYVeXFxc7ffJtCVLoG9f6NgR3noLWrWqeZ9x40K332vXli/Lzw/tFkVFGQtVRAQzm+busfdX1liCcPfTUl4nAF8HPkvjc/sDC9x9YTSexHjgjMqHB9pEDeCtgeXA5jT3zTqbN8OwYbBmDTz+eHrJAcJdTqnJAcL8ddfVfYwiIunakX5ESwlJoiZdCV2Dp+7XtdI2dwP7A4uBOcCPo1tr09kXADMbYWbFZla8dOnS9L5Bhtx4I7z6Ktx3HxxwQPr7ffRR7ZaLiNSHdDrr+wPhlz6EhNIXmJXGseNu6qxcn3UiMBM4DtgbeNnM/pPmvmGh+xhgDIQqpjTiyogXXoBf/xp+8IPQ9lAbPXrAokXxy0VEkpJOCaIYmBa93gCudvfvpbFfKRU79etGKCmkugB40oMFwIdA7zT3zRoffxySwsEHw1131X7/
0aNDm0Oq/PywXEQkKek8B/F3YL27b4HQ+Gxm+e6+tob9pgL7mFkv4BNgKPDdStt8BBwP/MfMugD7AQuBFWnsmxU2bQpPSW/YENodWras/THKGqKvuy5UK/XoEZKDGqhFJEnpJIjJwCBgTTTfEngJGFDdTu6+2cwuA/4J5AFj3f1tMxsZrb8PuBl4yMzmEKqVrnb3LwDi9q3tl6sPP/sZvPEGjB8P++6748cpKlJCEJHskk6CaOHuZckBd19jZvnV7ZCy7URgYqVl96VMLwa+le6+2eaZZ+C3v4VLLw2lCBGRxiSdNoivzKxf2YyZHQqsy1xIDcOHH8Lw4XDooSFJiIg0NumUIK4AHjezskbiPYCc/r28YUMY/McdHnsMmjdPOiIRkbqXTl9MU82sN6EB2YB33H1TxiPLYj/9KRQXw1NPhZ5aRUQao3T6YroUaOXuc919DtDazC7JfGjZ6fHH4Q9/gFGjwtjSIiKNVTptED+MRpQDwN2/BH6YsYiy2Pvvh873DjsMbrkl6WhERDIrnQTRJHWwoKgTvmaZCyk7rVsHZ58Nu+wCEyZAs5w7AyKSa9JppP4n8JiZ3Ufo7mIk8EJGo8pCV1wBs2bB88+rCwwRyQ3pJIirgRHAxYRG6hmEO5lyxqOPhq63r7kGTj456WhEROpHOt19bwXeJHSBUUjoGmN+huPKGvPnw0UXwVFHwc03Jx2NiEj9qbIEYWb7EvpAGgYsAyYAuPux9RNa8r76KrQ7tGoVutJomk55S0SkkajukvcO8B/gtKinVcxsVL1ElQXc4ZJLYN48eOkl2HPPpCMSEalf1VUxfQf4FHjFzO43s+OJH6ehUXrwQXj4YbjhBhg0KOloRETqX5UJwt2fcvdzCOMz/BsYBXQxs3vNLLaDvcZi9uzQAd/xx8PPf550NCIiyUinkfordx/n7qcSBu6ZCVyT6cCSsnp1aHdo1w7GjYO8vKQjEhFJRq2aXd19OfCn6NXouMOIEbBgAfzrX9ClS9IRiYgkR/flpLjvvnC30q9/Dccck3Q0IiLJSqerjZwwfXp4Wvqkk+Dqq5OORkQkeUoQwIoVod2hc+dw51ITnRUREVUxucOFF8JHH8Grr0LHjklHJCKSHXL+t/KXX8LChaH77gEDko5GRCR75HwJon17ePNNDRsqIlJZzicIgBYtko5ARCT75HwVk4iIxFOCEBGRWEoQIiISSwlCRERiKUGIiEgsJQgREYmlBCEiIrGUIEREJJYShIiIxFKCEBGRWEoQIiISSwlCRERiKUGIiEgsJQgREYmV0QRhZoPN7F0zW2Bm18Ss/6mZzYxec81si5m1j9aVmNmcaF1xJuMUEZHtZWw8CDPLA+4BTgBKgalm9qy7zyvbxt1vA26Ltj8NGOXuy1MOc6y7f5GpGEVEpGqZLEH0Bxa4+0J33wiMB86oZvthwN8yGI+IiNRCJhNEV+DjlPnSaNl2zCwfGAw8kbLYgZfMbJqZjajqQ8xshJkVm1nx0qVL6yBsERGBzCYIi1nmVWx7GvBapeqlI9y9H3AScKmZHR23o7uPcfdCdy/s1KnTzkUsIiLbZDJBlALdU+a7AYur2HYolaqX3H1x9P458BShykpEROpJJhPEVGAfM+tlZs0ISeDZyhuZWVvgGOCZlGWtzKxN2TTwLWBuBmMVEZFKMnYXk7tvNrPLgH8CecBYd3/bzEZG6++LNj0LeMndv0rZvQvwlJmVxfhXd38xU7GKiMj2zL2qZoGGp7Cw0IuL9ciEiEi6zGyauxfGrdOT1CIiEksJQkREYilBiIhILCUIERGJpQQhIiKxlCBERCSWEoSIiMRSghARkVhKECIiEksJQkREYilBiIhILCUIERGJpQQhIiKxlCBERCSWEoSIiMTK2IBBIpJbNm3aRGlpKevXr086FInRokULunXrxi677JL2PkoQIlInSktLadOmDQUFBUSjQUqWcHeWLVtGaWkpvXr1Sns/VTGJSJ1Yv349HTp0
UHLIQmZGhw4dal26U4IQkTqj5JC9duRvowQhIiKxlCBEJBHjxkFBATRpEt7HjdvxYy1btoy+ffvSt29fdt99d7p27bptfuPGjdXuW1xczOWXX17jZwwYMGDHA2yg1EgtIvVu3DgYMQLWrg3zixaFeYCiotofr0OHDsycOROAm266idatW3PllVduW79582aaNo2/3BUWFlJYWFjjZ7z++uu1D6yBUwlCROrdddeVJ4cya9eG5XVl+PDh/OQnP+HYY4/l6quv5q233mLAgAEccsghDBgwgHfffReAf//735x66qlASC4XXnghAwcOZK+99uKuu+7adrzWrVtv237gwIEMGTKE3r17U1RUhLsDMHHiRHr37s2RRx7J5Zdfvu24qUpKSjjqqKPo168f/fr1q5B4br31Vg466CD69OnDNddcA8CCBQsYNGgQffr0oV+/fnzwwQd1d5JqoBKEiNS7jz6q3fId9d577zFp0iTy8vJYtWoVU6ZMoWnTpkyaNIlrr72WJ554Yrt93nnnHV555RVWr17Nfvvtx8UXX7zdswMzZszg7bffZs899+SII47gtddeo7CwkIsuuogpU6bQq1cvhg0bFhtT586defnll2nRogXvv/8+w4YNo7i4mBdeeIGnn36a//73v+Tn57N8+XIAioqKuOaaazjrrLNYv349W7durduTVA0lCBGpdz16hGqluOV16eyzzyYvLw+AlStXcv755/P+++9jZmzatCl2n1NOOYXmzZvTvHlzOnfuzGeffUa3bt0qbNO/f/9ty/r27UtJSQmtW7dmr7322vacwbBhwxgzZsx2x9+0aROXXXYZM2fOJC8vj/feew+ASZMmccEFF5Cfnw9A+/btWb16NZ988glnnXUWEB52q0+qYhKRejd6NETXwW3y88PyutSqVatt0z//+c859thjmTt3Ls8991yVzwQ0b95823ReXh6bN29Oa5uyaqaa3HHHHXTp0oVZs2ZRXFy8rRHd3be7FTXdY2aKEoSI1LuiIhgzBnr2BLPwPmbMjjVQp2vlypV07doVgIceeqjOj9+7d28WLlxISUkJABMmTKgyjj322IMmTZrwyCOPsGXLFgC+9a1vMXbsWNZGjTPLly9n1113pVu3bjz99NMAbNiwYdv6+qAEISKJKCqCkhLYujW8ZzI5AFx11VX87Gc/44gjjth2Ua5LLVu25I9//CODBw/myCOPpEuXLrRt23a77S655BL+8pe/cNhhh/Hee+9tK+UMHjyY008/ncLCQvr27cvtt98OwCOPPMJdd93FwQcfzIABA/j000/rPPaqWNJFmLpUWFjoxcXFSYchkpPmz5/P/vvvn3QYiVqzZg2tW7fG3bn00kvZZ599GDVqVNJhbRP3NzKzae4ee5+vShAiInXk/vvvp2/fvhx44IGsXLmSiy66KOmQdoruYhIRqSOjRo3KqhLDzlIJQkREYilBiIhILCUIERGJpQQhIiKxlCBEpMEbOHAg//znPyssu/POO7nkkkuq3afstviTTz6ZFStWbLfNTTfdtO15hKo8/fTTzJs3b9v8DTfcwKRJk2oRffbKaIIws8Fm9q6ZLTCza2LW/9TMZkavuWa2xczap7OviEiZYcOGMX78+ArLxo8fX2WHeZVNnDiRdu3a7dBnV04Qv/zlLxk0aNAOHSvbZOw2VzPLA+4BTgBKgalm9qy7bzuT7n4bcFu0/WnAKHdfns6+IpK9rrgCouEZ6kzfvnDnnfHrhgwZwvXXX8+GDRto3rw5JSUlLF68mCOPPJKLL76YqVOnsm7dOoYMGcIvfvGL7fYvKCiguLiYjh07Mnr0aB5++GG6d+9Op06dOPTQQ4HwjMOYMWPYuHEjX/va13jkkUeYOXMmzz77LK+++iq/+tWveOKJJ7j55ps59dRTGTJkCJMnT+bKK69k8+bNfOMb3+Dee++lefPmFBQUcP755/Pcc8+xadMmHn/8cXr37l0hppKSEs4991y++uor
AO6+++5tgxbdeuutPPLIIzRp0oSTTjqJW265hQULFjBy5EiWLl1KXl4ejz/+OHvvvfdOnfNMliD6AwvcfaG7bwTGA2dUs/0w4G87uK+I5LAOHTrQv39/XnzxRSCUHs455xzMjNGjR1NcXMzs2bN59dVXmT17dpXHmTZtGuPHj2fGjBk8+eSTTJ06ddu6b3/720ydOpVZs2ax//7788ADDzBgwABOP/10brvtNmbOnFnhgrx+/XqGDx/OhAkTmDNnDps3b+bee+/dtr5jx45Mnz6diy++OLYaq6xb8OnTpzNhwoRto96ldgs+a9YsrrrqKiB0C37ppZcya9YsXn/9dfbYY4+dO6lk9kG5rsDHKfOlwDfjNjSzfGAwcNkO7DsCGAHQo677ChaRHVLVL/1MKqtmOuOMMxg/fjxjx44F4LHHHmPMmDFs3ryZJUuWMG/ePA4++ODYY/znP//hrLPO2tbl9umnn75t3dy5c7n++utZsWIFa9as4cQTT6w2nnfffZdevXqx7777AnD++edzzz33cMUVVwAh4QAceuihPPnkk9vtnw3dgmeyBGExy6rq+Ok04DV3X17bfd19jLsXunthp06dah1kXY6LKyLJOfPMM5k8eTLTp09n3bp19OvXjw8//JDbb7+dyZMnM3v2bE455ZQqu/kuU7nL7TLDhw/n7rvvZs6cOdx44401Hqemfu7KugyvqkvxbOgWPJMJohTonjLfDVhcxbZDKa9equ2+O6xsXNxFi8C9fFxcJQmRhqd169YMHDiQCy+8cFvj9KpVq2jVqhVt27bls88+44UXXqj2GEcffTRPPfUU69atY/Xq1Tz33HPb1q1evZo99tiDTZs2MS7lItGmTRtWr1693bF69+5NSUkJCxYsAEKvrMccc0za3ycbugXPZIKYCuxjZr3MrBkhCTxbeSMzawscAzxT2313Vn2Miysi9WfYsGHMmjWLoUOHAtCnTx8OOeQQDjzwQC688EKOOOKIavfv168f55xzDn379uU73/kORx111LZ1N998M9/85jc54YQTKjQoDx06lNtuu41DDjmkwnjRLVq04MEHH+Tss8/moIMOokmTJowcOTLt75IN3YJntLtvMzsZuBPIA8a6+2gzGwng7vdF2wwHBrv70Jr2renzatvdd5MmoeSwfdyhj3oRSZ+6+85+te3uO6O9ubr7RGBipWX3VZp/CHgonX3rWn2Niysi0hDl9JPU9TUurohIQ5TTCSKJcXFFGrPGNEJlY7Mjf5ucHzCoqEgJQaQutGjRgmXLltGhQ4cqbxWVZLg7y5Ytq/XzETmfIESkbnTr1o3S0lKWLl2adCgSo0WLFnTr1q1W+yhBiEid2GWXXejVq1fSYUgdyuk2CBERqZoShIiIxFKCEBGRWBl9krq+mdlSIObRtwalI/BF0kFkCZ2LinQ+KtL5KLcz56Knu8f2dNqoEkRjYGbFVT32nmt0LirS+ahI56Ncps6FqphERCSWEoSIiMRSgsg+Y5IOIIvoXFSk81GRzke5jJwLtUGIiEgslSBERCSWEoSIiMRSgsgCZtbdzF4xs/lm9raZ/TjpmJJmZnlmNsPM/pF0LEkzs3Zm9nczeyf6N3J40jElycxGRf9P5prZ38ysdl2UNnBmNtbMPjezuSnL2pvZy2b2fvS+W118lhJEdtgM/D933x84DLjUzA5IOKak/RiYn3QQWeL3wIvu3hvoQw6fFzPrClwOFLr71wlDEg+tfq9G5yFgcKVl1wCT3X0fYHI0v9OUILKAuy9x9+nR9GrCBaBrslElx8y6AacAf046lqSZ2a7A0cADAO6+0d1XJBpU8poCLc2sKZAPLE44nnrl7lOA5ZUWnwH8JZr+C3BmXXyWEkSWMbMC4BDgvwmHkqQ7gauArQnHkQ32ApYCD0ZVbn82s1ZJB5UUd/8EuB34CFgCrHT3l5KNKit0cfclEH5wAp3r4qBKEFnEzFoDTwBXuPuqpONJgpmdCnzu7tOS
jiVLNAX6Afe6+yHAV9RR9UFDFNWtnwH0AvYEWpnZ95KNqvFSgsgSZrYLITmMc/cnk44nQUcAp5tZCTAeOM7MHk02pESVAqXuXlai/DshYeSqQcCH7r7U3TcBTwIDEo4pG3xmZnsARO+f18VBlSCygIUBfB8A5rv775KOJ0nu/jN37+buBYTGx3+5e87+QnT3T4GPzWy/aNHxwLwEQ0raR8BhZpYf/b85nhxutE/xLHB+NH0+8ExdHFRDjmaHI4BzgTlmNjNadq27T0wuJMkiPwLGmVkzYCFwQcLxJMbd/2tmfwemE+7+m0GOdblhZn8DBgIdzawUuBG4BXjMzL5PSKJn18lnqasNERGJoyomERGJpQQhIiKxlCBERCSWEoSIiMRSghARkVhKECI1MLMtZjYz5VVnTzKbWUFqr5wi2UTPQYjUbJ279006CJH6phKEyA4ysxIz+42ZvRW9vhYt72lmk81sdvTeI1rexcyeMrNZ0ausi4g8M7s/GuPgJTNrGW1/uZnNi44zPqGvKTlMCUKkZi0rVTGdk7Julbv3B+4m9EJLNP2wux8MjAPuipbfBbzq7n0I/Sm9HS3fB7jH3Q8EVgDfiZZfAxwSHWdkZr6aSNX0JLVIDcxsjbu3jlleAhzn7gujzhY/dfcOZvYFsIe7b4qWL3H3jma2FOjm7htSjlEAvBwN9IKZXQ3s4u6/MrMXgTXA08DT7r4mw19VpAKVIER2jlcxXdU2cTakTG+hvG3wFOAe4FBgWjRAjki9UYIQ2TnnpLy/EU2/TvkwmEXA/0XTk4GLYduY27tWdVAzawJ0d/dXCIMntQO2K8WIZJJ+kYjUrGVKL7sQxocuu9W1uZn9l/Bja1i07HJgrJn9lDAaXFnvqz8GxkQ9bm4hJIslVXxmHvCombUFDLhDQ41KfVMbhMgOitogCt39i6RjEckEVTGJiEgslSBERCSWShAiIhJLCUJERGIpQYiISCwlCBERiaUEISIisf4/WATuQ1oRPzUAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Plot the accuracy over time\n",
    "plt.plot(epochs, acc, 'bo', label='Training acc')\n",
    "plt.plot(epochs, val_acc, 'b', label='Validation acc')\n",
    "plt.title('Training and validation accuracy')\n",
    "plt.xlabel('Epochs')\n",
    "plt.ylabel('Accuracy')\n",
    "plt.legend(loc='lower right')\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "hFFyCuJoXy7r"
   },
   "source": [
    "In this plot, the dots represent the training loss and accuracy, and the solid lines are the validation loss and accuracy.\n",
    "\n",
    "Notice the training loss *decreases* with each epoch and the training accuracy *increases* with each epoch. This is expected when using a gradient descent optimization—it should minimize the desired quantity on every iteration.\n",
    "\n",
    "This isn't the case for the validation loss and accuracy—they seem to peak before the training accuracy. This is an example of overfitting: the model performs better on the training data than it does on data it has never seen before. After this point, the model over-optimizes and learns representations *specific* to the training data that do not *generalize* to test data.\n",
    "\n",
    "For this particular case, you could prevent overfitting by simply stopping the training when the validation accuracy is no longer increasing. One way to do so is to use the `tf.keras.callbacks.EarlyStopping` callback."
   ]
  },
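  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of the idea behind early stopping, the following pure-Python sketch stops once the validation loss has failed to improve for `patience` consecutive epochs. It is a simplified model of the technique, not the Keras implementation; the function name `early_stopping_epoch` and the loss values are hypothetical and are not taken from this notebook's training run.\n",
    "\n",
    "```python\n",
    "def early_stopping_epoch(val_losses, patience=2):\n",
    "    # Return the 1-based epoch at which training would stop,\n",
    "    # or None if the loss keeps improving often enough.\n",
    "    best = float('inf')\n",
    "    wait = 0  # epochs since the last improvement\n",
    "    for epoch, loss in enumerate(val_losses, start=1):\n",
    "        if loss < best:\n",
    "            best = loss\n",
    "            wait = 0\n",
    "        else:\n",
    "            wait += 1\n",
    "            if wait >= patience:\n",
    "                return epoch\n",
    "    return None\n",
    "\n",
    "# Hypothetical validation losses (assumption, for illustration only)\n",
    "val_losses = [0.62, 0.50, 0.43, 0.40, 0.41, 0.42, 0.44]\n",
    "stop_epoch = early_stopping_epoch(val_losses, patience=2)\n",
    "print(stop_epoch)\n",
    "```\n",
    "\n",
    "In Keras, comparable behavior would typically come from passing `tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=2)` in the `callbacks` list of `model.fit()`."
   ]
  },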
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "-to23J3Vy5d3"
   },
   "source": [
    "## Export the model\n",
    "\n",
    "In the code above, you applied the `TextVectorization` layer to the dataset before feeding text to the model. If you want to make your model capable of processing raw strings (for example, to simplify deploying it), you can include the `TextVectorization` layer inside your model. To do so, you can create a new model using the weights you just trained."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "id": "FWXsMvryuZuq"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "782/782 [==============================] - 3s 3ms/step - loss: 0.3105 - accuracy: 0.8735\n",
      "0.873520016670227\n"
     ]
    }
   ],
   "source": [
    "# Export the model\n",
    "export_model = tf.keras.Sequential([\n",
    "  vectorize_layer,\n",
    "  model,\n",
    "  layers.Activation('sigmoid')\n",
    "])\n",
    "\n",
    "# TODO 5 -- Your code goes here(\n",
    "    loss=losses.BinaryCrossentropy(from_logits=False), optimizer=\"adam\", metrics=['accuracy']\n",
    ")\n",
    "\n",
    "# Test it with `raw_test_ds`, which yields raw strings\n",
    "loss, accuracy = export_model.evaluate(raw_test_ds)\n",
    "print(accuracy)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "TwQgoN88LoEF"
   },
   "source": [
    "### Inference on new data\n",
    "\n",
    "To get predictions for new examples, you can simply call `model.predict()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "id": "QW355HH5L49K"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0.6076096 ],\n",
       "       [0.42835376],\n",
       "       [0.34670413]], dtype=float32)"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "examples = [\n",
    "  \"The movie was great!\",\n",
    "  \"The movie was okay.\",\n",
    "  \"The movie was terrible...\"\n",
    "]\n",
    "\n",
    "export_model.predict(examples)"
   ]
  },
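  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The values returned above are sigmoid outputs between 0 and 1, one per review. A minimal sketch of how they might be turned into labels follows. It assumes that class 1 means a positive review and that 0.5 is a reasonable cutoff; both are assumptions, and the helper names `sigmoid` and `to_label` are hypothetical.\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "def sigmoid(logit):\n",
    "    # Map a raw logit to a value in (0, 1)\n",
    "    return 1.0 / (1.0 + math.exp(-logit))\n",
    "\n",
    "def to_label(probability, threshold=0.5):\n",
    "    # Assumption: scores at or above the threshold count as positive\n",
    "    return 'positive' if probability >= threshold else 'negative'\n",
    "\n",
    "# Scores copied from the predict() output shown above\n",
    "scores = [0.6076096, 0.42835376, 0.34670413]\n",
    "labels = [to_label(p) for p in scores]\n",
    "print(labels)\n",
    "```\n",
    "\n",
    "With this cutoff, only the first example is labeled positive. The score for the neutral review sits closest to the threshold."
   ]
  },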
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "MaxlpFWpzR6c"
   },
   "source": [
    "Including the text preprocessing logic inside your model enables you to export a model for production that simplifies deployment, and reduces the potential for [train/test skew](https://developers.google.com/machine-learning/guides/rules-of-ml#training-serving_skew).\n",
    "\n",
    "There is a performance difference to keep in mind when choosing where to apply your TextVectorization layer. Using it outside of your model enables you to do asynchronous CPU processing and buffering of your data when training on GPU. So, if you're training your model on the GPU, you probably want to go with this option to get the best performance while developing your model, then switch to including the TextVectorization layer inside your model when you're ready to prepare for deployment.\n",
    "\n",
    "Visit this [tutorial](https://www.tensorflow.org/tutorials/keras/save_and_load) to learn more about saving models."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "eSSuci_6nCEG"
   },
   "source": [
    "## Exercise: multi-class classification on Stack Overflow questions\n",
    "\n",
    "This tutorial showed how to train a binary classifier from scratch on the IMDB dataset. As an exercise, you can modify this notebook to train a multi-class classifier to predict the tag of a programming question on [Stack Overflow](http://stackoverflow.com/).\n",
    "\n",
    "A [dataset](https://storage.googleapis.com/download.tensorflow.org/data/stack_overflow_16k.tar.gz) has been prepared for you to use containing the body of several thousand programming questions (for example, \"How can I sort a dictionary by value in Python?\") posted to Stack Overflow. Each of these is labeled with exactly one tag (either Python, CSharp, JavaScript, or Java). Your task is to take a question as input, and predict the appropriate tag, in this case, Python. \n",
    "\n",
    "The dataset you will work with contains several thousand questions extracted from the much larger public Stack Overflow dataset on [BigQuery](https://console.cloud.google.com/marketplace/details/stack-exchange/stack-overflow), which contains more than 17 million posts.\n",
    "\n",
    "After downloading the dataset, you will find it has a similar directory structure to the IMDB dataset you worked with previously:\n",
    "\n",
    "```\n",
    "train/\n",
    "...python/\n",
    "......0.txt\n",
    "......1.txt\n",
    "...javascript/\n",
    "......0.txt\n",
    "......1.txt\n",
    "...csharp/\n",
    "......0.txt\n",
    "......1.txt\n",
    "...java/\n",
    "......0.txt\n",
    "......1.txt\n",
    "```\n",
    "\n",
    "Note: To increase the difficulty of the classification problem, occurrences of the words Python, CSharp, JavaScript, or Java in the programming questions have been replaced with the word *blank* (as many questions contain the language they're about).\n",
    "\n",
    "To complete this exercise, you should modify this notebook to work with the Stack Overflow dataset by making the following modifications:\n",
    "\n",
    "1. At the top of your notebook, update the code that downloads the IMDB dataset with code to download the [Stack Overflow dataset](https://storage.googleapis.com/download.tensorflow.org/data/stack_overflow_16k.tar.gz) that has already been prepared. As the Stack Overflow dataset has a similar directory structure, you will not need to make many modifications.\n",
    "\n",
    "1. Modify the last layer of your model to `Dense(4)`, as there are now four output classes.\n",
    "\n",
    "1. When compiling the model, change the loss to `tf.keras.losses.SparseCategoricalCrossentropy`. This is the correct loss function to use for a multi-class classification problem, when the labels for each class are integers (in this case, they can be 0, *1*, *2*, or *3*). In addition, change the metrics to `metrics=['accuracy']`, since this is a multi-class classification problem (`tf.metrics.BinaryAccuracy` is only used for binary classifiers).\n",
    "\n",
    "1. When plotting accuracy over time, change `binary_accuracy` and `val_binary_accuracy` to `accuracy` and `val_accuracy`, respectively.\n",
    "\n",
    "1. Once these changes are complete, you will be able to train a multi-class classifier. "
   ]
  },
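  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the loss change in the exercise more concrete, here is a small pure-Python sketch of what sparse categorical cross-entropy computes for one example when the model outputs four raw logits. It only illustrates the formula and is not the TensorFlow implementation; the logit values and helper names are hypothetical.\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "def softmax(logits):\n",
    "    # Subtract the maximum first for numerical stability\n",
    "    peak = max(logits)\n",
    "    exps = [math.exp(z - peak) for z in logits]\n",
    "    total = sum(exps)\n",
    "    return [e / total for e in exps]\n",
    "\n",
    "def sparse_categorical_crossentropy(label, logits):\n",
    "    # label is an integer class index, not a one-hot vector\n",
    "    return -math.log(softmax(logits)[label])\n",
    "\n",
    "# Hypothetical outputs of a final Dense(4) layer for one question\n",
    "logits = [2.0, 0.5, 0.1, -1.0]\n",
    "loss_correct = sparse_categorical_crossentropy(0, logits)\n",
    "loss_wrong = sparse_categorical_crossentropy(3, logits)\n",
    "predicted_class = max(range(len(logits)), key=lambda i: logits[i])\n",
    "print(loss_correct, loss_wrong, predicted_class)\n",
    "```\n",
    "\n",
    "The loss is small when the largest logit belongs to the true class and grows as the true class receives less probability, which is why integer labels can be used directly without one-hot encoding."
   ]
  },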
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "F0T5SIwSm7uc"
   },
   "source": [
    "## Learning more\n",
    "\n",
    "This tutorial introduced text classification from scratch. To learn more about the text classification workflow in general, check out the [Text classification guide](https://developers.google.com/machine-learning/guides/text-classification/) from Google Developers.\n"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "collapsed_sections": [],
   "name": "text_classification.ipynb",
   "provenance": [],
   "toc_visible": true
  },
  "environment": {
   "kernel": "python3",
   "name": "tf2-gpu.2-6.m91",
   "type": "gcloud",
   "uri": "gcr.io/deeplearning-platform-release/tf2-gpu.2-6:m91"
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
